Disclosed is a series of nucleic acid probes for use in diagnosing and monitoring certain types of leukemia using, e.g.,
Southern and Northern blot analyses and fluorescence in situ hybridization (FISH). These probes detect rearrangements, such as
translocations involving chromosome band 11q23 with other chromosome bands, including 4q21, 6q27, 9p22 and 19p13.3, in both
dividing leukemic cells and interphase nuclei. The breakpoints in all such translocations are clustered within an 8.3 kb BamHI
genomic region of the MLL gene. A 0.7 kb BamHI cDNA fragment derived from this gene detects rearrangements on
Southern blot analysis with a single BamHI restriction digest in all patients with the common 11q23 translocations and in
patients with other 11q23 anomalies. Northern blot analyses are presented demonstrating that the MLL gene has multiple
transcripts and that transcript size differentiates leukemic cells from normal cells. Also disclosed are MLL fusion proteins,
MLL protein domains and anti-MLL antibodies.
DESCRIPTION

COMPOSITIONS AND METHODS FOR DETECTING
GENE REARRANGEMENTS AND TRANSLOCATIONS

BACKGROUND OF THE INVENTION

This application is a continuation-in-part of
copending application USSN 07/991,224, filed December
16, 1992, which was a continuation-in-part of USSN
07/900,689, filed June 17, 1992. The entire text of each
of the above-referenced disclosures is specifically
incorporated by reference herein without disclaimer.

The government owns rights in the present invention
pursuant to grants CA42557, CA40046, CA38725, CA34775,
5T32 CA09566 and 5T32 CA09273-12 from the National
Institutes of Health and DE-FG02-86ER60408 from the
Department of Energy.

1. Field of the Invention

The present invention relates generally to the
diagnosis of cancer. The invention concerns the creation
of probes for use in diagnosing and monitoring certain
genetic abnormalities, including those found in leukemia
and lymphoma, using molecular biological hybridization
techniques. In particular, it concerns the localization
of the translocation breakpoint on the MLL gene, the
identification of nucleic acid probes capable of
detecting rearrangements in all patients with the common
11q23 translocations and the identification of MLL mRNA
transcripts characteristic of leukemic cells. MLL fusion
proteins and anti-MLL antibodies are also disclosed.



WO 93/25713 



PCT/US93/05857 




2. Description of the Related Art

The etiology of a substantial portion of human
diseases lies, at least in part, with genetic factors.
The identification and detection of genetic factors
associated with particular diseases or malformations
provides a means for diagnosis and for planning the most
effective course of treatment. For some conditions,
early detection may allow prevention or amelioration of
the devastating courses of the particular disease.

The genetic material of an organism is located
within one or more microscopically visible entities
termed chromosomes. In higher organisms, such as man,
chromosomes contain the genetic material DNA and also
contain various proteins and RNA. The study of
chromosomes, termed cytogenetics, is often an important
aspect of disease diagnosis. One class of genetic
factors which leads to various disease states is
chromosomal aberrations, i.e., deviations in the expected
number and/or structure of chromosomes for a particular
species or for certain cell types within a species.

There are several classes of structural aberrations
which may involve either the autosomal or sex
chromosomes, or a combination of both. Such aberrations
may be detected by noting changes in chromosome
morphology, as evidenced by band patterns, in one or more
chromosomes. Normal phenotypes may be associated with
rearrangements if the amount of genetic material has not
been altered; however, physical or mental anomalies
result from chromosomal rearrangements where there has
been a gain or loss of genetic material. Deletions, or
deficiencies, refer to loss of part of a chromosome,
whereas duplication refers to addition of material to
chromosomes. Duplication and deficiency of genetic
material can be produced by breakage of chromosomes, by
errors during DNA synthesis, or as a consequence of
segregation of other rearrangements into gametes.

Translocations are interchromosomal rearrangements
effected by breakage and transfer of part of chromosomes
to different locations. In reciprocal translocations,
pieces of chromosomes are exchanged between two or more
chromosomes. Generally, the exchanges of interest are
between non-homologous chromosomes. If all the original
genetic material appears to be preserved, this condition
is referred to as balanced. Unbalanced forms have
duplications or deficiencies of genetic material
associated with the exchange; that is, some material has
been gained or lost in the process.



One of the most interesting associations between
chromosomal aberrations and human disease is that between
chromosomal aberrations and cancer. Non-random
translocations involving chromosome 11 band q23 occur
frequently in both myeloid and lymphoblastic leukemias
(Rowley, 1990b; Heim & Mitelman, 1987). The four most
common reciprocal translocations are t(4;11) and
t(11;19), which exhibit mainly lymphoblastic markers and
sometimes monocytic markers, or both lymphoblastic and
monoblastic markers; and t(6;11) and t(9;11), which are
mainly found in monoblastic and/or myeloblastic leukemias
(Mitelman et al., 1991). Other chromosomes which are
involved in recurring translocations with this band in
acute leukemias are chromosomes X, 1, 2, 10, and 17.



The present inventors have previously demonstrated,
by fluorescence in situ hybridization (FISH), that a
yeast artificial chromosome (YAC) containing the CD3D and
CD3G genes was split in cells with the four most common
translocations (Rowley et al., 1990). Further studies
led the inventors to the identification of the gene
located at the breakpoint, which was named MLL for mixed
lineage leukemia or myeloid/lymphoid leukemia (Ziemin-van
der Poel et al., 1991). The MLL gene has also been
independently termed ALL-1 (Cimino et al., 1991; Gu et
al., 1992a; b), Htrx (Djabali et al., 1992) and HRX
(Tkachuk et al., 1992). The present inventors
differentiated the more centromeric MLL rearrangements
from the more telomeric breakpoint translocations which
involve the RCK locus (Akao et al., 1991b) or the p54
gene (Lu & Yunis, 1992).

From the same YAC clone as described by the present
inventors (Rowley et al., 1990), a DNA fragment was
obtained which allowed the detection of rearrangements in
leukemic cells from certain patients (Cimino et al.,
1991; 1992). This 0.7 kilobase DdeI fragment allowed
detection of rearrangements in a 5.8 kilobase region in 6
of 7 patients with the t(4;11), 4 of 5 with t(9;11), and
3 of 4 with the t(11;19) translocations (Cimino et al.,
1992). Combining these results with those from a
subsequent series including an additional 14 patients,
the DdeI fragment probe was found to detect
rearrangements in 26 of 30 cases with t(4;11), t(9;11)
and t(11;19) translocations (Cimino et al., 1991; 1992),
which represents an overall detection rate of 87%.
Despite this partial success, the failure of the DdeI
probe to detect all rearrangements is a significant
drawback to its use in clinical diagnosis.
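
The detection figures quoted above can be checked with a short calculation. The following Python sketch is illustrative only: the per-translocation counts and the combined 26 of 30 figure are taken from the text, whereas the variable and function names are hypothetical, and the breakdown of the additional series by translocation type is not given in the source and is therefore not modeled.

```python
# Illustrative tally of the DdeI-probe detection figures quoted in the text.
# (detected, tested) for the first series, as reported by Cimino et al., 1992.
first_series = {
    "t(4;11)": (6, 7),
    "t(9;11)": (4, 5),
    "t(11;19)": (3, 4),
}

detected_first = sum(detected for detected, _ in first_series.values())
total_first = sum(tested for _, tested in first_series.values())

# Combined figures quoted in the text for both series together.
combined_detected, combined_total = 26, 30

# What the combined figures imply for the additional series.
additional_total = combined_total - total_first
additional_detected = combined_detected - detected_first


def rate(detected, tested):
    """Detection rate as a percentage, rounded to the nearest whole number."""
    return round(100 * detected / tested)


print(f"first series: {detected_first}/{total_first}")
print(f"additional series: {additional_detected}/{additional_total}")
print(f"combined: {combined_detected}/{combined_total} = "
      f"{rate(combined_detected, combined_total)}%")
```

The implied size of the additional series (30 minus 16) agrees with the "additional 14 patients" stated above, and 26 of 30 rounds to the quoted 87%.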

Accordingly, prior to the present invention, there
remained a particular need for the identification of
nucleic acid fragments or probes capable of detecting
leukemic cells from all patients with the common 11q23
translocations. The creation of such probes which may be
used in both Southern blot analyses and in FISH with
either dividing leukemic cells or interphase nuclei would
be particularly important. The elucidation of further
information regarding the MLL gene, such as further
sequence data and information regarding transcription
into mRNA, would also be advantageous, as would the
identification of nucleic acid fragments capable of
differentiating MLL mRNA transcripts from normal and
leukemic cells.



SUMMARY OF THE INVENTION

The present invention seeks to overcome these and
other drawbacks inherent in the prior art by providing
improved compositions and methods for the diagnosis, and
continued monitoring, of various types of leukemias,
particularly myeloid and lymphoid leukemia, and lymphomas
in humans. This invention particularly provides novel
and improved probes for use in genetic analyses, for
example, in Southern and Northern blotting and in
fluorescence in situ hybridization (FISH) using either
dividing leukemic cells or interphase nuclei.

The inventors first localized the translocation
breakpoint on the MLL gene to within an estimated 9 kb
BamHI genomic region of the MLL gene, and later sequenced
this region and found it to be 8.3 kb in size. They have
further identified short nucleic acid probes, as
exemplified by a breakpoint-spanning 0.7 kb BamHI cDNA
fragment, which detect rearrangements on Southern blot
analysis of singly-digested DNA in all patients with the
common 11q23 translocations, namely t(4;11), t(6;11),
t(9;11), and t(11;19), and also in certain patients with
other rare 11q23 anomalies. The use of this novel
nucleic acid probe represents a significant advantage
over previously described probes which allowed the
molecular diagnosis of leukemia only in certain cases of
common 11q23 translocations, and not in all cases.

The invention also provides probe compositions for
use in Northern blot analyses and methods for identifying
leukemic cells from the pattern of MLL mRNA transcripts
present, which are herein shown to be different in
leukemic cells as opposed to normal cells.

The present invention generally concerns the
breakpoint-spanning gene named MLL, and this term is used
throughout the present text. MLL is the accepted
designation for this gene adopted by the human genome
nomenclature committee (Chromosome Co-ordinating Meeting,
1992); however, other terms are also in current use to
describe the same gene. For example, the terms ALL-1
(Cimino et al., 1991; Gu et al., 1992a; b), Htrx (Djabali
et al., 1992) and HRX (Tkachuk et al., 1992) are also
currently employed as names for the MLL gene. As these
terms in fact refer to the same gene, i.e., to the MLL
gene, each of the foregoing ALL-1, Htrx and HRX 'genes'
is encompassed by the present invention and is
described herein, for simplicity, by the single term
"MLL".

In certain embodiments, the invention concerns a
method for detecting leukemic cells containing 11q23
chromosome translocations that involve MLL, which method
comprises obtaining nucleic acids from cells suspected of
containing a leukemia-associated chromosomal
rearrangement at chromosome 11q23, and probing said
nucleic acids with a probe capable of differentiating
between the nucleic acids from normal cells and the
nucleic acids from leukemic cells. To "differentiate
between the nucleic acids from normal cells and the
nucleic acids from leukemic cells" will generally require
using a probe, such as those disclosed herein, which
allows MLL DNA or RNA from normal cells to be identified
and differentiated from MLL DNA or RNA from leukemic
cells by criteria such as, e.g., number, pattern, size or
location of the MLL nucleic acids.

The cells suspected of containing a chromosomal
rearrangement at chromosome 11q23 may be cells from cell
lines or otherwise transformed or cultured cells.
Alternatively, they may be cells obtained from an
individual suspected of having a leukemia associated with
an 11q23 chromosome translocation, or cells from a
patient known to be presently or previously suffering
from such a disorder.



The nucleic acids obtained for analysis may be DNA,
and preferably, genomic DNA, which may be digested with
one or more restriction enzymes and probed with a nucleic
acid probe capable of detecting DNA rearrangements from
leukemic cells containing 11q23 chromosome
translocations. Techniques such as these are based upon
'Southern blotting' and are well known in the art (for
example, see Sambrook et al. (1989), incorporated herein
by reference). A large battery of restriction enzymes
is commercially available and the conditions for
Southern blotting are described hereinbelow, suitable
modifications of which will be known to those skilled in
the art of molecular biology.

Preferred nucleic acid probes for use in Southern
blotting to detect leukemic cells containing 11q23
chromosome translocations are those probes which include
a sequence in accordance with the sequence of a 0.7 kb
BamHI fragment of the cDNA clone 14P-18B derived from the
MLL gene, and more preferably, will be the probe MLL 0.7B
(seq id no:1) itself. The use of this probe is
particularly advantageous as this fragment encompasses
the breakpoints clustered in the 8.3 kb BamHI genomic
region (seq id no:6) of the MLL gene and allows the
detection of all the common 11q23 translocations.
Moreover, using MLL 0.7B (also simply referred to as
0.7B) presents the added advantage that DNA may be
digested with only a single restriction enzyme, namely
BamHI. Probe MLL 0.7B (seq id no:1) is derived from a
cDNA clone that lacks Exon 8 sequences, but this clearly
has no adverse effects on breakpoint detection using this
probe.
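
To illustrate how a single-enzyme Southern blot of this kind might be read, the following Python sketch flags any band that departs from the 8.3 kb germline BamHI fragment. It is a minimal sketch under stated assumptions: the function name, the 0.5 kb tolerance and the input format are hypothetical and form no part of the disclosure, and gel-estimated sizes may deviate from sequenced sizes. The example band sizes are those reported for the Sup-T13 cell line in the legend to Figure 3.

```python
# Size of the germline BamHI genomic fragment, as stated in the text.
GERMLINE_KB = 8.3


def rearranged_bands(band_sizes_kb, tolerance_kb=0.5):
    """Return the bands that do not match the germline fragment.

    tolerance_kb is an assumed allowance for gel sizing error; it is a
    hypothetical parameter chosen for illustration only.
    """
    return [
        size for size in band_sizes_kb
        if abs(size - GERMLINE_KB) > tolerance_kb
    ]


# Germline band plus the two rearranged bands reported for Sup-T13.
sup_t13 = [8.3, 7.0, 1.4]
print(rearranged_bands(sup_t13))

# A normal control is expected to show the germline band only.
print(rearranged_bands([8.3]))
```

Any non-empty result would suggest a rearrangement within the region detected by the probe, subject to confirmation against a normal control run on the same gel.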

Patients' or cultured cells may also be analyzed for
the presence of 11q23 chromosome translocations by
obtaining RNA, and preferably, mRNA, from the cells and
probing the RNA with a nucleic acid probe capable of
differentiating between the MLL mRNA species in normal
and leukemic cells. This differentiation will generally
involve using a probe capable of identifying normal MLL
gene transcripts and aberrant MLL gene transcripts,
wherein a reduction in the amount of a normal MLL gene
transcript, such as those estimated to be about 12.5 kb,
12.0 kb or 11.5 kb in length, or the presence of an
aberrant MLL gene transcript, not detectable in normal
cells, will be indicative of a cell containing an 11q23
chromosome translocation. Techniques of detecting and
characterizing mRNA transcripts, based upon Northern
blotting, are described herein and suitable modifications
will be known to those of skill in the art (e.g., see
Sambrook et al., 1989).

It is important to note that throughout this text
the sizes of certain transcripts quoted are estimated
measurements from Northern blot analyses. It is well
known in the art that agarose gel resolution of RNA
species of about 9 to 10 kb in size, or greater, leads to
an approximate size determination, especially with sizes
of greater than about 10 kb. Hence, size determinations
made initially by this technique may later be found to be
over- or under-estimates of the true size of a given
transcript. For example, the MLL translocation
breakpoint was first localized to an estimated 9 kb BamHI
genomic region which the inventors later found, by
sequencing, to be 8.3 kb in size. It is possible that
the estimated sizes of the larger mRNA transcripts may
differ as much as about 2 kb up to about 3 kb from their
size determined by sequencing, and that the 12.5 kb to
11 kb size range may be more accurately represented by a
15 kb to 13 kb size range. This general phenomenon has
been observed before in regard to the MLL gene itself
(e.g., Cimino et al., 1991; 1992).
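
The size adjustment discussed above can be made concrete with a small calculation. The Python sketch below assumes, as the example in the text suggests for the larger transcripts, that the gel estimates fall short of the sequenced size by about 2 to 3 kb; the function name and default offsets are hypothetical and serve only to illustrate the arithmetic.

```python
def corrected_range(estimated_kb, min_offset_kb=2.0, max_offset_kb=3.0):
    """Return (low, high) bounds in kb for the plausible true size.

    Assumes the Northern blot estimate is an underestimate by between
    min_offset_kb and max_offset_kb (illustrative values from the text).
    """
    return (estimated_kb + min_offset_kb, estimated_kb + max_offset_kb)


# The two ends of the estimated size range quoted in the text.
for estimate in (12.5, 11.0):
    low, high = corrected_range(estimate)
    print(f"{estimate} kb estimated -> {low} to {high} kb plausible")
```

Under this assumption the 12.5 kb estimate maps to an interval that contains 15 kb and the 11 kb estimate to one that contains 13 kb, in line with the alternative range stated above.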



Using the probes of this invention, a reduction in
the amount of MLL gene transcripts estimated to be of
about 12.5 kb, 12.0 kb or 11.5 kb in length (or about
15-13 kb), as compared to the level of such transcripts in
normal cells, is indicative of cells which contain an
11q23 chromosome translocation. The size of aberrant MLL
transcripts will naturally vary between the individual
cell lines and patients' cells examined, but will
nevertheless always be distinguishable from the size and
pattern of MLL transcripts identified by the same
probe(s) in normal cells.

In RS4;11 cells, the specific rearranged mRNA
transcripts identified as characteristic of leukemic
cells are estimated to be of about 11.5 kb, 11.25 kb or
11.0 kb in length, and so an elevation in the levels of
such transcripts is indicative of a cell containing an
11q23 chromosome translocation. In the Karpas 45 cell
line (K45, t(X;11)(q13;q23)), the aberrant mRNA
transcripts have estimated sizes of about 8 kb and about
6 kb, which are therefore another example of transcripts
characteristic of leukemic cells. In any event, it will
be clear that using the probes of the present invention
one may differentiate between normal and leukemic cell
transcripts, and thus identify leukemic cells in an assay
or screening protocol, regardless of the actual size and
pattern of the aberrant transcripts themselves.

Probes preferred for use in analyzing mRNA
transcripts in order to identify cells with an 11q23
chromosome translocation, i.e., for use in Northern
blotting detection, are contemplated to be those based
upon the cDNA clones 14P-18B (seq id no:4) and 14-7 (seq
id no:5). In such Northern blotting detection, the use
of cDNA clone 14-7 itself (seq id no:5) and various
fragments of clone 14P-18B (seq id no:4) is contemplated.
The use of 14P-18B fragments in Northern blotting is
generally preferred, with the nucleic acid fragments
termed MLL 0.7B (0.7B, seq id no:1), MLL 0.3BE (0.3BE,
seq id no:2) and MLL 1.5EB (1.5EB, seq id no:3) being
particularly preferred.






The use of a combination of the probes described
above may provide further advantages in certain cases as
it may allow the differentiation of further distinct MLL
gene transcripts. An example of this is presented herein
in the case of the RS4;11 cell line. It is
demonstrated herein that normal cells contain an MLL gene
transcript of estimated length 11.5 kb and that RS4;11
leukemic cells have a reduced amount of this normal
transcript (in common with their reduced amount of the
12.5 kb and 12.0 kb normal transcripts). However, the
inventors have also determined that the RS4;11 leukemic
cells contain an aberrant mRNA transcript, also estimated
to be about 11.5 kb in length, which is present in
significant quantities and may even be termed
over-expressed (a specific increase in the level of an mRNA
transcript in comparison to the level in normal cells is
indicative of "over-expression").






The probe termed 1.5EB (seq id no:3) is herein shown
to detect the normal 11.5 kb transcript, and a weak
signal in a Northern blot employing this probe is
therefore indicative of a leukemic cell containing an
11q23 chromosome translocation. Each of the more
telomeric probes, namely 0.7B, 0.3BE and 14-7 (seq id
nos:1, 2, and 5, respectively), is shown to detect the
over-expressed, aberrant, 11.5 kb transcript in RS4;11
cells, and a strong signal in a Northern blot employing
any of these probes therefore characterizes a leukemic
cell with an RS4;11-like translocation. A further
advantage of the present invention is, therefore, that in
using more than one probe, it provides methods by which
to differentiate between normal and aberrant transcripts
which may be similar in size, and thus increases the
number of factors with which to differentiate between
leukemic and normal cells.
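
The two-probe reasoning above can be summarized as a small decision rule. The Python sketch below is illustrative only: the function name, the three-level signal scale ("reduced", "normal", "elevated", each judged against the same probe on normal control RNA) and the returned phrases are assumptions introduced here, and a real assay would rely on quantified signals with a loading control.

```python
# Probe names as used in the text; the grouping follows the paragraph above.
CENTROMERIC_PROBE = "1.5EB"
TELOMERIC_PROBES = ("0.7B", "0.3BE", "14-7")


def interpret_band(signals):
    """Interpret the band of estimated size 11.5 kb.

    signals maps a probe name to "reduced", "normal" or "elevated",
    relative to the same probe hybridized to normal control RNA.
    """
    normal_reduced = signals.get(CENTROMERIC_PROBE) == "reduced"
    aberrant_elevated = any(
        signals.get(probe) == "elevated" for probe in TELOMERIC_PROBES
    )
    if normal_reduced and aberrant_elevated:
        return "RS4;11-like pattern"
    if normal_reduced:
        return "reduced normal transcript"
    if aberrant_elevated:
        return "elevated transcript with telomeric probes"
    return "no difference from control"


example = {
    "1.5EB": "reduced",
    "0.7B": "elevated",
    "0.3BE": "elevated",
    "14-7": "elevated",
}
print(interpret_band(example))
```

Combining the centromeric and telomeric readings in this way separates two transcripts of similar estimated size that a single probe could not distinguish.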

The probes of the present invention may also be used
to identify leukemic cells containing 11q23 chromosome
translocations in situ, that is, without extraction of
the genetic material. Fluorescence in situ hybridization
(FISH), which allows cell nuclei to be analyzed directly,
is one method which is considered to be particularly
suitable for use in accordance with the present
invention. Cells may be analyzed in metaphase, a stage
in cell division wherein the chromosomes are individually
distinguishable due to contraction. However, the methods
and compositions of the present invention are
particularly advantageous in that they are equally
suitable for use with interphase cells, a stage wherein
chromosomes are so elongated that they are entwined and
cannot be individually distinguished.

Cloned DNA probes from both sides of the
translocation breakpoint region can be used with FISH to
detect the translocation in leukemic cells. In normal
cells, these two probes would be together and they would
appear as a single signal. In cells with a
translocation, the centromeric probe would remain on the
derivative 11 chromosome whereas the telomeric probe
would be translocated to the other derivative chromosome.
This would result in two smaller signals, one on each
translocation partner. As the inventors have shown that
about 30% of patients have a deletion of the MLL gene
immediately telomeric to the breakpoint, they have cloned
a series of telomeric probes that can be used reliably to
detect the translocation in virtually all patients.
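
The expected FISH patterns described above can be expressed as a simple scoring rule for a single nucleus. The Python sketch below is a hypothetical illustration: the function name, the signal categories and the wording of the results are assumptions made here, and the handling of a missing telomeric signal reflects only the deletion noted in the text, not a validated scoring scheme.

```python
def classify_nucleus(fused, centromeric_alone, telomeric_alone):
    """Classify one nucleus from counts of FISH signal types.

    fused: signals where the centromeric and telomeric probes coincide.
    centromeric_alone / telomeric_alone: separated single-probe signals.
    """
    if centromeric_alone == 0 and telomeric_alone == 0:
        return "normal pattern" if fused > 0 else "no signal"
    if centromeric_alone > 0 and telomeric_alone > 0:
        return "split signal, consistent with translocation"
    if centromeric_alone > 0:
        return "centromeric signal alone, possible telomeric deletion"
    return "telomeric signal alone, not interpretable"


# A cell with one normal chromosome 11 and one translocated copy.
print(classify_nucleus(fused=1, centromeric_alone=1, telomeric_alone=1))

# A cell in which both copies of chromosome 11 appear intact.
print(classify_nucleus(fused=2, centromeric_alone=0, telomeric_alone=0))
```

The third branch corresponds to the situation in which a telomeric probe lies within a deleted region, which is why a series of telomeric probes is described as useful.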

Whether employing Southern blotting, Northern
blotting, FISH, or any other amenable techniques, the
present invention provides improved methods for analyzing
cells from patients suspected of having a leukemia
associated with an 11q23 chromosome translocation. In
that the probes disclosed herein are able to detect DNA
rearrangements in all patients with the common 11q23
translocations, i.e., there are no false-negatives, their
use represents a significant advance in the art.

This invention will be particularly useful in the
analysis of individuals who have already had one
malignant disease that has been treated with certain
drugs that induce leukemia with 11q23 translocations in
10 to 25% of patients (Ratain & Rowley, 1992). Thus
cells from these patients can be monitored with Southern
blot analysis, PCR and FISH to detect cells with an 11q23
translocation and thus identify patients very early in
the course of their disease. In addition, the probes
described in this invention can be used to monitor the
response to therapy of leukemia patients known to have an
11q23 translocation. These leukemic cells show a
substantial decrease in frequency in response to therapy.



In further embodiments, the present invention
concerns compositions comprising nucleic acid segments,
and particularly DNA segments, isolated free from total
genomic DNA, which have a sequence in accordance with, or
complementary to, the sequence of cDNA clone 14P-18B (seq
id no:4) or cDNA clone 14-7 (seq id no:5) derived from
the MLL gene. Such DNA segments are exemplified by the
clones 14P-18B (seq id no:4) and 14-7 (seq id no:5)
themselves, and also by various fragments of such
sequences. cDNA clones 14P-18B and 14-7 may be
characterized as being derived from the MLL gene, as
being about 4.1 kb and about 1.3 kb in length,
respectively, and as having restriction patterns as
indicated in Figure 1 and Figure 2.

The invention provides probes which span the MLL
breakpoint, e.g., 0.7B; probes centromeric to the
breakpoint, e.g., 1.5EB; and probes telomeric to the
breakpoint, e.g., 0.3BE, 14-7, and even 0.8E.
Particularly preferred DNA segments of the present
invention are those DNA segments represented by the
nucleic acid fragments, or probes, termed MLL 0.7B (0.7B,
seq id no:1), MLL 0.3BE (0.3BE, seq id no:2) and MLL
1.5EB (1.5EB, seq id no:3).

The nucleic acid segments and probes of the present
invention are contemplated for use in detecting cells,
and particularly, cells from human subjects, which
contain an 11q23 chromosome translocation. However, they
are not limited to such uses and also have utility in a
variety of other embodiments, for example, as probes or
primers in nucleic acid hybridization embodiments. The
ability of these nucleic acid segments to specifically
hybridize to MLL gene-like sequences will enable them to
be of use in various assays to detect complementary
sequences, other than for diagnostic purposes. The use
of such nucleic acid segments as primers for the cloning
of further portions of genomic DNA, or for the
preparation of mutant species primers, is particularly
contemplated. The DNA segments of the invention may also
be employed in recombinant expression. For example, as
disclosed herein, they have been used in the production of
peptides or proteins for further analysis or for antibody
generation.

The present invention also embodies kits for use in
the detection of leukemic cells containing 11q23
chromosome translocations. Kits for use in both Southern
and Northern blotting and in FISH protocols are
contemplated, and such kits will generally comprise a
first container which includes one or more nucleic acid
probes which include a sequence in accordance with the
sequences of nucleic acid probes MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) or 14-7
(seq id no:5), and a second container which comprises one
or more unrelated nucleic acid probes for use as a
control. In preferred embodiments, such kits will
include one or more of the nucleic acid probes termed MLL
0.7B (seq id no:1), MLL 0.3BE (seq id no:2), MLL 1.5EB
(seq id no:3) or 14-7 (seq id no:5) themselves, and kits
for use in connection with FISH or Northern blotting
will, most preferably, include all such nucleic acid
probes or segments.

Kits for the detection of leukemic cells containing
11q23 chromosome translocations by Southern blotting may
also include a third container which includes one or more
restriction enzymes. Particularly preferred Southern
blotting kits will be those which include the nucleic
acid probe MLL 0.7B (seq id no:1) and the restriction
enzyme BamHI. Naturally, kits for use in connection with
FISH will contain one or more nucleic acid probes which
are fluorescently labelled.

Further embodiments of the present invention concern
MLL peptides, polypeptides, proteins, and fusions thereof
and antibodies having binding affinity for such proteins,
peptides and fusions. The invention therefore concerns
proteins or peptides which include an MLL amino acid
sequence, purified relative to their natural state. Such
proteins or peptides may contain only MLL sequences
themselves or may contain MLL sequences linked to other
protein sequences, such as, e.g., 'natural' sequences
derived from other chromosomes or portions of
'engineered' proteins such as glutathione-S-transferase
(GST), ubiquitin, β-galactosidase and the like.



Proteins prepared in accordance with the invention
may include MLL amino acid sequences which are either
telomeric or centromeric to the breakpoint region, as
exemplified by the amino acid sequences of seq id no:8
and amino acids 323-623 of seq id no:7, respectively.
Other proteins which are contemplated to be particularly
useful are those including a zinc finger region from seq
id no:7, such as those generally located between amino
acids 574-1184, and more particularly, those including
amino acids 574 to about 810 and about 1057 to 1184 of
seq id no:7. Antibodies prepared in accordance with the
invention may be directed against any of the
'centromeric' or 'telomeric' proteins described herein,
or portions thereof, with antibodies against the zinc
finger regions of seq id no:7 being particularly
contemplated.



BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 1.

Alignment of cDNA clones of the MLL gene with genomic
sequences. The top thick solid line represents the
genomic sequence in which not all the restriction sites
are indicated. The sizes above the line (14 kb, 8.3 kb
and ~20 kb) refer to the BamHI fragments. The two dashed
lines located above the 14 kb BamHI genomic fragment
indicate the 2.1 kb BamHI/SstI telomeric fragment (14BS),
and the 0.8 kb PstI centromeric fragment (14P) used to
screen the cDNA library. The solid line under each cDNA
clone indicates the region of homology between clones.
The predicted direction of transcription of MLL and the
open reading frame of clone 14-7 are indicated by the
arrow. Restriction enzymes used: B, BamHI; S, SstI; Sa,
SalI; P, PstI; H, HindIII; X, XhoI; E, EcoRI; Bg, BglI.

Figure 2.

A map of cDNA clones 14-7 and 14P-18B. Restriction
enzymes are the same as in Figure 1. The solid lines
below the cDNA clones indicate the cDNA fragments used in
the Southern and Northern hybridizations. All of clone
14-7, and three adjacent fragments of 0.3 kb BamHI/EcoRI
(MLL 0.3BE), 0.7 kb BamHI (MLL 0.7B) and 1.5 kb
EcoRI/BamHI (MLL 1.5EB) from cDNA clone 14P-18B were
used. Note that the EcoRI site used to excise the 1.5 kb
fragment was a cloning EcoRI site. The breakpoint region
within the 0.7 kb BamHI fragment is also shown, as is the
0.8 kb EcoRI probe (MLL 0.8E) employed in analyzing the
Karpas 45 cell line. It will be noted that the
orientation of the probes represented in this figure is
reversed to that in sequence 14P-18B (seq id no:4), where
MLL 1.5EB is first, MLL 0.7B is next and MLL 0.3BE is
last.



Figure 3.

Southern blot of DNA from cell lines and patient leukemic
cells with 11q23 translocations digested with BamHI and
hybridized to MLL 0.7B. Lanes 1, 7, control DNA; lane
2, RS4;11 cell line; lanes 3-5, patients 1-3 (as detailed
in Table 1); lane 6, Sup-T13 cell line showing weak
hybridization to two rearranged bands of 7.0 kb and
1.4 kb; lane 8, RC-K8 cell line. DNA fragment sizes in
kilobases are shown on the left.

Figure 4.

Northern blot analyses of poly(A)+ RNA. Poly(A)+ RNA was
isolated from cell lines in logarithmic growth phase
except where noted. RNA sizes are indicated on the left.
Figure 4 consists of Figure 4A and Figure 4B.

Figure 4A. Each lane 1 is the RCH-ADD cell line; each
lane 2 is the RC-K8 cell line and each lane 3 is the
RS4;11 cell line in stationary growth phase. The
Northern blots in this panel were hybridized sequentially
to the 14-7 probe, (a); the MLL 0.7B probe, (b); and the
MLL 1.5EB probe, (c). Hybridization to actin is also
shown in this panel in (a).

Figure 4B. RNA from the RS4;11 cell line. The Northern
blots in this panel were hybridized in the same manner to
the 14-7 probe, (a); the MLL 0.3BE probe, (b); the MLL
0.7B probe, (c); and the MLL 1.5EB probe, (d).

Figure 5.

Schematic representation of the Northern blot results
obtained from the sequential hybridization of probes
(14-7, MLL 0.3BE, MLL 0.7B and MLL 1.5EB) to control (C) and
RS4;11 cell line (4;11) RNA. Only the large size
transcripts are shown. The solid lines indicate normal
sized transcripts of normal mRNA with estimated sizes of
12.5, 12.0 and 11.5 kb which are detected in both control
and RS4;11 cell lines. The dashed lines represent the
aberrant sized transcripts with estimated sizes of 11.5,
11.25 and 11.0 kb detected in the RS4;11 cell line. In
the RS4;11 cell line the normal and altered (estimated)
11.5 kb mRNA transcripts are indicated by an overlapping
broken and solid line. The line thickness indicates the
strength of the hybridization signal. The chromosomal
origin of each transcript is depicted on the right.

Figure 6.

Southern hybridization of patient DNA digested with BamHI
and probed with the 0.7 kilobase BamHI cDNA fragment.
Sizes are in kilobases. Lane 1: Normal peripheral white
blood cell DNA, Lane 2: AML with t(1;11)(q21;q23), Lane
3: ALL with t(4;11)(q21;q23), Lane 4: ALL with
t(4;11)(q21;q23), Lane 5: ALL with t(4;11)(q21;q23), Lane
6: ALL with t(4;11)(q21;q23), Lane 7: ALL with
t(4;11)(q21;q23), Lane 8: AML with t(6;11)(q27;q23),
Lane 9: AML with t(6;11)(q27;q23), Lane 10: AML with
t(9;11)(p22;q23), Lane 11: AML with t(10;11)(p13;q21),
Lane 12: Lymphoma with t(10;11)(p15;q22), Lane 13: AML
with ins(10;11)(p11;q23q24), Lane 14: AML with
ins(10;11)(p13;q21q24), Lane 15: ALL with
t(11;19)(q23;p13.3), Lane 16: AML with
t(11;19)(q23;p13.3), Lane 17: AML with
t(11;22)(q23;q12). A single germline band was detected
in normal DNA in lane 1 and in patient samples with
non-11q23 breakpoints in lanes 11, 12, and 14.
Rearrangements were detected in all other lanes. Lanes
2, 3, 4, 6, 7, 8, 10, 13, 16, 17 had two rearranged
bands, and lanes 5, 9, and 15 had one rearranged band.

Figure 7.

Southern hybridization of leukemic and normal DNA
digested with BamHI and probed with the 0.7 kilobase
BamHI cDNA fragment and with the centromeric and
telomeric PCR-derived probes. Sizes are in kilobases.
Figure 7 consists of Figure 7A, Figure 7B and Figure 7C.

Figure 7A. DNA probed with 0.7 kilobase cDNA probe.
Lane 1: Biphenotypic leukemia with t(11;19)(q23;p13.3),
lane 2: ALL with t(11;19)(q23;p13.3), lane 3: AML with
t(11;19)(q23;p13.3), lane 4: normal DNA, lane 5: AML
with t(6;11)(q27;q23), lane 6: Follicular lymphoma with
t(6;11)(p12;q23). A single germline 8.3 kilobase band is
identified in normal DNA in lane 4 and is also present in
all other lanes. Two rearranged bands, corresponding to
the two derivative chromosomes, are identified in lanes
1, 2, and 3. A single rearranged band is present in
lanes 5 and 6.

Figure 7B. The blot from panel A was stripped and
rehybridized with the centromeric PCR probe. The
germline 8.3 kilobase band is again present in all lanes.
In lanes 1-3, one of the two rearranged bands is
detected. In lane 3, the rearranged band is slightly
larger than the germline band. In lanes 5 and 6, the
single rearranged band is also identified.

Figure 7C. The blot from panel A was stripped and then
rehybridized with the telomeric PCR probe. The germline
band is present in all lanes. In lanes 1-3, one of the
two rearranged bands is identified. In lane 2, the
rearranged band is slightly smaller than the germline
band. However, the single rearranged band in lanes 5 and
6 is not detected.

Figure 8.

Southern hybridization of patient DNA digested with BamHI
and probed with 0.7 kilobase BamHI cDNA fragment and with
the centromeric and telomeric PCR-derived probes. Lane
1: AML with t(1;11)(q21;q23) - same patient as in lane 2
of Figure 7. Lane 2: ALL with t(4;11)(q21;q23) - the
same patient as shown in lane 6 of Figure 7. Figure 8
consists of Figure 8A, Figure 8B and Figure 8C.

Figure 8A. DNA probed with the 0.7 kilobase cDNA probe.
The germline band and two rearranged bands are present in
both lanes.

Figure 8B. The blot from panel A was stripped and
rehybridized with the centromeric PCR probe. The
germline band and both rearranged bands are again
detected.

Figure 8C. The blot from panel A was stripped and then
rehybridized with the telomeric PCR probe. The germline
band and only one of the rearranged bands are detected.

Figure 9. Representation of the 8.3 kb BamHI Genomic
Section of the MLL gene and Various cDNA Probes.

Figure 10. Reactivity of Specific anti-MLL Antisera
Directed Against the MLL Amino Acids of Seq Id No:8.
Western blots of pre-immune sera (lanes 1, 7 & 8) and
high titer rabbit antisera (lanes 2-6, 9 & 19) specific
for the MLL portion of the MLL-GST fusion protein. The
creation of an expression vector for the production of an
MLL amino acid-containing fusion protein containing MLL
amino acids of seq id no:8 and GST is described in
Example IV.

Figure 11. Southern blot analysis of DNA from human
placenta (C) and the Karpas 45 cell line (K45,
t(X;11)(q13;q23)) digested with BamHI and hybridized to
the 0.7B cDNA fragment of MLL (seq id no:1). DNA size
markers are shown on the left and the lines on the right
denote the rearranged DNA bands detected in the Karpas 45
cell line.

Figure 12. Northern blot analysis of RNA isolated from
two control cell lines RC-K8 (C) and RCH-ADD (C) and the
Karpas 45 cell line (K45) with a t(X;11)(q13;q23)
translocation. The blot was sequentially hybridized to
the 0.8E, 0.7B and 1.5EB cDNA fragments of the MLL gene.
Hybridization to actin is also shown. The markers on the
right denote the size of the detected transcripts, and
the lines to the right of the blots locate the altered
MLL transcripts seen in the Karpas 45 cell line.

DETAILED DESCRIPTION
OF THE PREFERRED EMBODIMENTS

Introduction 

The molecular analysis of recurring structural
chromosome abnormalities in human neoplasia has led to
the identification of a number of genes involved in these
rearrangements. These genetic alterations are implicated
in the development of malignancies. For example, in
chronic myelogenous leukemia, the proto-oncogene ABL is
translocated from chromosome 9 to the BCR gene on
chromosome 22 leading to the generation of a chimeric
gene and a fusion protein (Rowley, 1990b). In lymphoid
malignancies, translocations frequently involve the
immunoglobulin or T-cell receptor genes which are
juxtaposed to key oncogenes causing their abnormal
expression (Rowley, 1990a).

Translocations involving chromosome band 11q23 have
been identified as a frequent cytogenetic abnormality in
lymphoid and myeloid leukemias and in lymphomas
(Sandberg, 1990). In addition to leukemias that occur de
novo, 11q23 translocations are also observed in
therapy-related leukemias. The t(4;11) has been reported
in 2% to 7% of all cases of acute lymphoblastic leukemia
(ALL) and in up to 60% of leukemias in children under the
age of one year (Parkin et al., 1982; Pui et al., 1991;
Kaneko et al., 1988). By French-American-British (FAB)
Cooperative Group criteria, these leukemias are usually
classified morphologically as L1. Typically, these
patients express myeloid or monocytoid markers in
addition to the B-cell lymphoid markers (Kaneko et al.,
1988; Drexler et al., 1991). On flow cytometry, a
characteristic phenotype, CD10-, CD15+, CD19+, CD24-/+,
has been reported (Pui et al., 1991). These patients
often present with hyperleukocytosis and early central
nervous system involvement (Arthur et al., 1982).

The t(11;19) is more complex because two
translocations involving different breakpoints in 19p
with different phenotypic features have been identified.
Approximately two-thirds have a t(11;19)(q23;p13.3) and
include patients with ALL, biphenotypic leukemia, and
infants or young children with AML. One-third have a
t(11;19)(q23;p13.1) and are generally older children or
adults with AML-M4 and M5. The t(4;11) and the t(11;19)
have been recognized as a cytogenetic subset in ALL with
a poor prognosis (Gibbons et al., 1990).

Translocations involving 11q23 are frequent in acute
myeloid leukemia (AML) and have also been found to occur
preferentially in childhood (Fourth Int. Wksh. Cancer
Genet. Cytogenet., 1984). The t(9;11) and both t(11;19)
are the most common, but other rearrangements, such as
the t(6;11), an insertion (10;11), and deletions
involving 11q23 have also been reported (Mitelman et al.,
1991). Morphologically these cases are usually
categorized as acute myelomonocytic leukemia (AML-M4) or
acute monoblastic leukemia (AML-M5) by FAB criteria.
Similar to ALL, these patients often present with high
leukemic blast cell counts. 11q23 abnormalities have
generally been considered to carry a poor prognosis in
AML (Fourth Int. Wksh. Cancer Genet. Cytogenet., 1984).
However, the use of intensive chemotherapy in these
patients has led to complete remission rates and
remission durations that are similar to a group with
favorable cytogenetic abnormalities (Samuels et al.,
1988). Many cases of AML with 11q23 anomalies have been
found, by flow cytometry, to express lymphoid markers
(Cuneo et al., 1992).

Abnormalities of 11q23 have been found to be common
in both the lymphoid and myeloid leukemias as well as in
biphenotypic leukemias which have both lymphoid and
myeloid features (Hudson et al., 1991). This has led to
the hypothesis that rearrangements of a gene at 11q23 may
affect a pluripotential progenitor cell capable of either
myeloid or lymphoid differentiation. Alternatively, a
mechanism for differentiation that is shared by both
lymphoid and myelo-monocytic stem cells may be
deregulated as a consequence of these translocations.

DNA Segments and Nucleic Acid Hybridization

As used herein, the term "DNA segment" is intended
to refer to a DNA molecule which has been isolated free
of total genomic DNA of a particular species. Therefore,
DNA segments of the present invention will generally be
MLL DNA segments which are isolated away from total human
genomic DNA, although DNA segments isolated from other
species, such as, e.g., Drosophila, may also be included
in certain embodiments. Included within the term "DNA
segment" are DNA segments which may be employed as
probes, and those for use in the preparation of vectors,
as well as the vectors themselves, including, for
example, plasmids, cosmids, phage, viruses, and the like.

The techniques described in the following detailed
examples are the generally preferred techniques for use
in connection with certain preferred embodiments of the
present invention. However, in that this invention
concerns nucleic acid sequences and DNA segments, it will
be apparent to those of skill in the art that this
discovery may be used in a wide variety of molecular
biological embodiments.

The DNA sequences disclosed herein will also find
utility as probes or primers in modifications of the
nucleic acid hybridization embodiments detailed in the
following examples. As such, it is contemplated that
oligonucleotide fragments corresponding to any of the
cDNA or genomic sequences disclosed herein for stretches
of from about 10 nucleotides to about 20 or to about
30 nucleotides will have utility, with even longer
sequences, e.g., 40, 50 or 100 bases, 1 kb, 2 kb or 4 kb,
8.3 kb, 20 kb, 30 kb, 50 kb or even up to about 100 kb or
more also having utility. The larger sized DNA segments
in the order of about 20, 30, 50 or about 100 kb or even
more, are contemplated to be useful in FISH embodiments.

The ability of such nucleic acid probes to
specifically hybridize to MLL-encoding or other MLL
genomic sequences will enable them to be of use in a
variety of embodiments. For example, the probes can be
used in a variety of assays for detecting the presence of
complementary sequences in a given sample. However,
other uses are envisioned, including the use of the
sequence information for mapping the precise breakpoints
in individual patients, and for the preparation of mutant
species primers or primers for use in preparing other
genetic constructions.

Nucleic acid molecules having stretches of 10, 20,
30, 50, 100, 200, 500 or 1000 or so nucleotides or even
more, in accordance with or complementary to any of seq
id no:1 through seq id no:6 will have utility as
hybridization probes. These probes will be useful in a
variety of hybridization embodiments, not only in
Southern and Northern blotting in connection with
analyzing patients' genes, but also in analyzing normal
hematopoietic development and in charting the evolution
of certain genes. The total size of fragment used, as
well as the size of the complementary stretch(es), will
ultimately depend on the intended use or application of
the particular nucleic acid segment. Smaller fragments
will generally find use in hybridization embodiments,
wherein the length of the complementary region may be
varied, such as between about 10 and about 100
nucleotides, up to 0.7 kb, 1.3 kb or 1.5 kb or even up to
8.3 kb or more, according to the complementary sequences
one wishes to detect.

The use of a hybridization probe of about 10
nucleotides in length allows the formation of a duplex
molecule that is both stable and selective. Molecules
having complementary sequences over stretches greater
than 10 bases in length are generally preferred, though,
in order to increase stability and selectivity of the
hybrid, and thereby improve the quality and degree of
specific hybrid molecules obtained. One will generally
prefer to design nucleic acid molecules having
gene-complementary stretches of 15 to 20 nucleotides, or
even longer where desired. Such fragments may be readily
prepared by, for example, directly synthesizing the
fragment by chemical means, by application of nucleic
acid reproduction technology, such as the PCR technology
of U.S. Patent 4,603,102 (herein incorporated by
reference) or by introducing selected sequences into
recombinant vectors for recombinant production.
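
The effect of probe length on selectivity can be illustrated with a rough calculation. The following is a minimal sketch only: it assumes a random target sequence with equal base frequencies and a genome of about 3 x 10^9 base pairs, neither of which is a figure taken from this disclosure, and the function name is hypothetical.

```python
def expected_random_matches(probe_length, genome_size_bp=3_000_000_000):
    """Expected number of exact chance matches to a probe.

    Idealized model (an assumption): bases are independent and
    equally frequent, and both strands of the genome are searched.
    """
    positions = 2 * (genome_size_bp - probe_length + 1)  # both strands
    return positions / (4 ** probe_length)


for n in (10, 15, 20, 30):
    print(n, "nt:", expected_random_matches(n))
```

Under these assumptions a 10 nucleotide stretch is expected to match thousands of sites by chance, whereas a 20 nucleotide stretch is expected to match fewer than one, which is in keeping with the preference stated above for longer complementary stretches.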

Accordingly, the nucleotide sequences of the
invention may be used for their ability to selectively
form duplex molecules with complementary stretches of
MLL-like genes or cDNAs. Depending on the application
envisioned, one will desire to employ varying conditions
of hybridization to achieve varying degrees of
selectivity of probe towards target sequence. For
applications requiring high selectivity, one will
typically desire to employ relatively stringent
conditions to form the hybrids, e.g., one will select
relatively low salt and/or high temperature conditions,
such as provided by 0.02M-0.15M NaCl at temperatures of
50°C to 70°C. Such selective conditions tolerate little,
if any, mismatch between the probe and the template or
target strand, and would be particularly suitable for
isolating MLL-like genes, for example, to gather
information on the gene in different cell types or at
different stages of the cell's cycle.

Of course, for some applications, for example, where
one desires to prepare mutants employing a mutant primer
strand hybridized to an underlying template or where one
seeks to isolate MLL-encoding sequences from related
species, functional equivalents, or the like, less
stringent hybridization conditions will typically be
needed in order to allow formation of the heteroduplex.
In these circumstances, one may desire to employ
conditions such as 0.15M-0.9M salt, at temperatures
ranging from 20°C to 55°C. Cross-hybridizing species can
thereby be readily identified as positively hybridizing
signals with respect to control hybridizations. In any
case, it is generally appreciated that conditions can be
rendered more stringent by the addition of increasing
amounts of formamide, which serves to destabilize the
hybrid duplex in the same manner as increased
temperature. Thus, hybridization conditions can be
readily manipulated, and thus will generally be a method
of choice depending on the desired results. Less
stringent conditions would be suitable for identifying
related genes, such as, for example, further Drosophila
or yeast genes, or genes from any organism known to be
interesting from an evolutionary or developmental
standpoint.
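
The way salt, temperature and formamide interact can be sketched with a commonly quoted empirical estimate of the melting temperature (Tm) of a long DNA:DNA hybrid. This is an illustration only and is not a formula given in this disclosure: the coefficients differ somewhat between references, and the 50% G+C content and 700 nucleotide probe length used below are assumed values chosen for the example.

```python
import math


def approx_tm_celsius(na_molar, gc_percent, length_nt, formamide_percent=0.0):
    """Rough Tm estimate for a long DNA:DNA hybrid.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%G+C)
         - 600/N - 0.61*(%formamide)
    The length and formamide terms vary between references, so the
    result is a guide rather than a prediction.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 600.0 / length_nt
            - 0.61 * formamide_percent)


# Assumed example values: 50% G+C, 700 nt probe.
tm_low_salt = approx_tm_celsius(0.02, 50, 700)
tm_high_salt = approx_tm_celsius(0.9, 50, 700)
tm_high_salt_formamide = approx_tm_celsius(0.9, 50, 700, formamide_percent=50)

print(round(tm_low_salt, 1), round(tm_high_salt, 1),
      round(tm_high_salt_formamide, 1))
```

In this model, lowering the salt concentration or adding formamide lowers the estimated Tm, so a hybridization carried out at a fixed temperature sits closer to the Tm and tolerates less mismatch. Raising the temperature has the corresponding effect directly.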

In certain embodiments, it will be advantageous to
employ nucleic acid sequences of the present invention in
combination with an appropriate means, such as a label,
for determining hybridization. A wide variety of
appropriate indicator means are known in the art,
including fluorescent, radioactive, enzymatic or other
ligands, such as avidin/biotin, which are capable of
giving a detectable signal. In preferred embodiments,
one will likely desire to employ a fluorescent label or
an enzyme tag, such as urease, alkaline phosphatase or
peroxidase, instead of radioactive or other
environmentally undesirable reagents. In the case of
enzyme tags, colorimetric indicator substrates are known
which can be employed to provide a means visible to the
human eye or spectrophotometrically, to identify specific
hybridization with complementary nucleic acid-containing
samples.

In general, it is envisioned that the hybridization
probes described herein will be useful both as reagents
in solution hybridization as well as in embodiments
employing a solid phase. In embodiments involving a
solid phase, the test DNA (or RNA) is adsorbed or
otherwise affixed to a selected matrix or surface. This
fixed, single-stranded nucleic acid is then subjected to
specific hybridization with selected probes under desired
conditions. The selected conditions will depend on the
particular circumstances based on the particular criteria
required (depending, for example, on the G+C contents,
type of target nucleic acid, source of nucleic acid, size
of hybridization probe, etc.). Following washing of the
hybridized surface so as to remove nonspecifically bound
probe molecules, specific hybridization is detected, or
even quantified, by means of the label.

It is contemplated that longer DNA segments will
find utility in the recombinant production of peptides or
proteins. DNA segments which encode peptides of from
about 15 to about 50 amino acids in length, or more
preferably, from about 15 to about 30 amino acids in
length are contemplated to be particularly useful in
certain embodiments, e.g., in raising anti-peptide
antibodies. DNA segments encoding larger polypeptides,
domains, fusion proteins or the entire MLL protein will
also be useful. DNA segments encoding peptides will
generally have a minimum coding length in the order of
about 45 to about 90 or 150 nucleotides, whereas DNA
segments encoding larger MLL proteins, polypeptides,
domains or fusion proteins may have coding segments
encoding about 350, 430 or about 650 amino acids, and may
be about 1.2 kb, 4.1 kb or even about 8.3 kb in length.
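
The minimum coding lengths quoted above follow from the three nucleotides required per codon. As a minimal sketch (the function name is hypothetical, and the optional stop codon is an added convenience rather than something specified in the text):

```python
def min_coding_length_nt(n_amino_acids, include_stop=False):
    """Nucleotides needed to encode a peptide, at three per codon."""
    codons = n_amino_acids + (1 if include_stop else 0)
    return 3 * codons


for aa in (15, 30, 50):
    print(aa, "amino acids ->", min_coding_length_nt(aa), "nucleotides")
```

Peptides of 15, 30 and 50 amino acids therefore require 45, 90 and 150 nucleotides of coding sequence, respectively. The overall length of a DNA segment may be greater than its coding length because of flanking or non-coding sequences.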

The nucleic acid segments of the present invention,
regardless of the length of the coding sequence itself,
may be combined with other DNA sequences, such as
promoters, polyadenylation signals, additional
restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall
length may vary considerably. It is contemplated that a
nucleic acid fragment of almost any length may be
employed, with the total length preferably being limited
by the ease of preparation and use in the intended
recombinant DNA protocol. For example, nucleic acid
fragments may be prepared in accordance with the present
invention which are up to 20,000 base pairs in length, as
may segments of 10,000, 5,000 or about 3,000, or of about
1,000 base pairs in length or less.

It will be understood that this invention is not
limited to the particular nucleic and amino acid
sequences of seq id nos:1 through 6 and seq id nos:7 and
8, respectively. Therefore, DNA segments prepared in
accordance with the present invention may also encode
biologically functional equivalent proteins or peptides
which have variant amino acid sequences. Such sequences
may arise as a consequence of codon redundancy and
functional equivalency which are known to occur naturally
within nucleic acid sequences and the proteins thus
encoded. Alternatively, functionally equivalent proteins
or peptides may be created via the application of
recombinant DNA technology, in which changes in the
protein structure may be engineered, based on
considerations of the properties of the amino acids being
exchanged.
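
Codon redundancy can be shown with a short example using the standard genetic code. The two nine-nucleotide sequences below are arbitrary illustrations and are not taken from any of the MLL sequences disclosed herein; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding strand, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_a = "GCTAAAGAA"  # arbitrary example sequence
seq_b = "GCCAAGGAG"  # differs at every third position

print(translate(seq_a), translate(seq_b))
```

Both sequences translate to the same peptide although they differ at three nucleotide positions, which is the sense in which distinct DNA segments may encode the same or a functionally equivalent protein.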

DNA segments encoding an MLL gene may be introduced
into recombinant host cells and employed for expressing
the encoded protein. Alternatively, through the
application of genetic engineering techniques,
subportions or derivatives of selected MLL genes may be
employed. Equally, through the application of
site-directed mutagenesis techniques, one may re-engineer
DNA segments of the present invention to alter the coding
sequence, e.g., to introduce improvements to the
antigenicity of the protein or to test MLL protein
mutants in order to examine the structure-function
relationships at the molecular level. Where desired, one
may also prepare fusion peptides, e.g., where the MLL
coding regions are aligned within the same expression
unit with other proteins or peptides having desired
functions, such as for immunodetection purposes (e.g.,
enzyme label coding regions), for stability purposes, for
purification or purification and cleavage, or to impart
any other desirable characteristic to an MLL-based fusion
product.

MLL Protein Expression, Purification and Uses

In certain embodiments, DNA segments encoding MLL
protein portions may be produced and employed to express
the MLL proteins, domains or fusions thereof. Such DNA
segments will generally encode proteins including MLL
amino acid sequences of between about 100, 200, 250, 300
or about 650 amino acids, although longer sequences up to
and including about 3800 or 3968 MLL amino acids are also
contemplated. MLL protein regions which are both
telomeric and centromeric to the breakpoint region may be
produced, as exemplified herein by the generation of
fusion proteins including MLL amino acids set forth in
seq id no:8 and by amino acids 323-623 of seq id no:7.
Other specific regions contemplated by the inventors to
be particularly useful include, for example, the zinc
finger regions represented by amino acids 574-1184, and
more particularly, those including amino acids 574 to
about 810 and about 1057 to 1184 of seq id no:7.

As a point of comparison with other nomenclature
currently used in the art, the MLL amino acids of clone
14-7 (seq id no:8), telomeric to the breakpoint region,
correspond to the HRX amino acids 2772-3209 in Figure 4
of Tkachuk et al. (1992), and the MLL amino acids 323-623
of clone 14P-18B (seq id no:7), centromeric to the
breakpoint region, correspond to the HRX amino acids
1101-1400 (Tkachuk et al., 1992). It should also be
noted here that the cDNA clone 14P-18B (seq id no:4)
differs from the published sequence of Tkachuk et al.
(1992) in that clone 14P-18B lacks exon 8 sequences.
This arose as a result of using a cDNA obtained
subsequent to an alternative splicing reaction. Such
alternative splicing is known to occur in other zinc
finger proteins, such as the Wilms tumor protein. The
zinc finger regions in the Tkachuk et al. sequence are
represented generally by amino acids 1350-1700 and
1700-2000.

The expression and purification of MLL proteins is
exemplified herein by the generation of MLL fusion
proteins including glutathione S transferase, by their
expression in E. coli, and by the use of
glutathione-agarose affinity chromatography. However, it
will be understood that there are many methods available
for the recombinant expression of proteins and peptides,
any or all of which will likely be suitable for use in
accordance with the present invention. MLL proteins may
be expressed in both eukaryotic and prokaryotic
recombinant host cells, although it is believed that
bacterial expression has advantages over eukaryotic
expression in terms of ease of use and quantity of
materials obtained thereby.

MLL proteins and peptides produced in accordance
with the present invention may contain only MLL sequences
themselves or may contain MLL sequences linked to other
protein or peptide sequences. The MLL segments may be
linked to other 'natural' sequences, such as those
derived from other chromosomes, and also to 'engineered'
protein or peptide sequences, such as
glutathione-S-transferase (GST), ubiquitin,
β-galactosidase, β-lactamase, antibody domains and, in
fact, virtually any protein or peptide sequence which one
desires. The use of enzyme sensitive peptide sequences,
such as, e.g., those found in the blood clotting cascade
proteins, is also contemplated. One such application
involves the use of a fusion protein domain for
purification, e.g., using affinity chromatography, and
then the subsequent cleavage of the fusion protein by a
specific enzyme to release the MLL portion of the fusion
protein.

As used herein, the term "engineered" or
"recombinant" cell is intended to refer to a eukaryotic
or prokaryotic cell into which a recombinant MLL DNA
segment has been introduced. Therefore, engineered cells
are distinguishable from naturally occurring cells which
do not contain recombinantly introduced DNA, i.e., DNA
introduced through the hand of man. Recombinantly
introduced DNA segments will generally be in the form of
cDNA (i.e., they will not contain introns), although the
use of genomic MLL sequences is not excluded.

For protein expression, one would position the
coding sequences adjacent to and under the control of a
promoter. It is understood in the art that to bring a
coding sequence under the control of a promoter, one
positions the 5' end of the transcription initiation site
of the transcriptional reading frame of the protein
between about 1 and about 50 nucleotides "downstream" of
(i.e., 3' of) the chosen promoter. Where eukaryotic
expression is contemplated, one will also typically
desire to incorporate into the transcriptional unit an
appropriate polyadenylation site (e.g., 5'-AATAAA-3') if
one was not contained within the original cloned segment.
Typically, the poly A addition site is placed about 30 to
2000 nucleotides "downstream" of the termination site of
the protein at a position prior to transcription
termination.
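
The spacing guidelines in the preceding paragraph can be expressed as a simple check on a planned construct. The sketch below is purely illustrative: the class, its field names and the coordinate convention are hypothetical, and the distance from the promoter to the start of the coding region is used as a simplified stand-in for the positioning described above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionUnit:
    """Hypothetical 1-based coordinates along a construct, 5' to 3'."""
    promoter_end: int                 # last nucleotide of the promoter
    coding_start: int                 # first nucleotide of the coding region
    coding_end: int                   # last nucleotide of the coding region
    polya_site: Optional[int] = None  # start of AATAAA, if present


def layout_problems(unit: ExpressionUnit, eukaryotic: bool = True) -> List[str]:
    """Return a list of departures from the approximate spacing guidelines."""
    problems = []
    gap = unit.coding_start - unit.promoter_end
    if not 1 <= gap <= 50:
        problems.append(f"coding region is {gap} nt from promoter (guide: 1-50)")
    if eukaryotic:
        if unit.polya_site is None:
            problems.append("no polyadenylation site")
        else:
            distance = unit.polya_site - unit.coding_end
            if not 30 <= distance <= 2000:
                problems.append(
                    f"poly A site is {distance} nt past coding end (guide: 30-2000)")
    return problems


print(layout_problems(ExpressionUnit(100, 120, 1400, 1500)))
print(layout_problems(ExpressionUnit(100, 300, 1400)))
```

Because the text gives these distances as approximate ("about"), such a check is best treated as a planning aid rather than a strict rule.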

The promoters used will generally be recombinant or
heterologous promoters. As used herein, a recombinant or
heterologous promoter is intended to refer to a promoter
that is not normally associated with the MLL gene in
its natural environment. Such promoters may include
virtually any promoter isolated from any bacterial or
eukaryotic cell. Naturally, it will be important to
employ a promoter that effectively directs the expression
of the DNA segment in the cell type chosen for
expression. The use of promoter and cell type
combinations for protein expression is generally known to
those of skill in the art of molecular biology, for
example, see Sambrook et al. (1989). The promoters
employed may be constitutive, or inducible, and can be
used under the appropriate conditions to direct high
level expression of the introduced DNA segment, such as
is advantageous in the large-scale production of
recombinant proteins or peptides.

Further aspects of the present invention concern the
purification or substantial purification of MLL-based
proteins. The term "purified" as used herein, is
intended to refer to a composition which includes a
protein incorporating an MLL amino acid sequence, wherein
the protein is purified to any degree relative to its
naturally-obtainable state. The "naturally-obtainable
state" may be relative to the purity within a human cell
or cell extract, e.g., for an MLL fusion protein produced
in leukemic cells of a given patient, or may be relative
to the purity within an engineered cell or cell extract,
e.g., for a man-made MLL fusion protein.


Generally, "purified" will refer to an MLL protein 
or MLL peptide composition which has been subjected to 






fractionation to remove various non-MLL protein
components such as other cell components. Various
techniques suitable for use in protein purification will
be well known to those of skill in the art. These
include, for example, precipitation with ammonium
sulphate, PEG, antibodies and the like or by heat
denaturation, followed by centrifugation; chromatography
steps such as ion exchange, gel filtration, reverse
phase, hydroxylapatite and affinity chromatography;
isoelectric focusing; gel electrophoresis; and
combinations of such and other techniques. A specific
example presented herein is the purification of MLL:GST
fusion proteins using glutathione-agarose affinity
chromatography, followed by preparative
SDS-polyacrylamide gel electrophoresis and electroelution.

The recombinant peptides or proteins produced from
the DNA segments of the present invention will have uses
in a variety of embodiments. For example, peptides,
polypeptides and full-length proteins may be employed in
the generation of antibodies directed against the MLL
protein and antigenic sub-portions of the protein.
Techniques for the production of polyclonal and
monoclonal antibodies are described hereinbelow and are
well known to those of skill in the art. The production
of antibodies would be particularly useful as this would
enable further detailed analyses of the location and
function of the MLL protein, and MLL-related species,
which clearly have an important role in mammalian cells
and other cell types. The proteins may also be employed
in various assays, such as DNA binding assays, and
proteins and peptides may be employed to define the
precise regions of the MLL protein which interact with
targets, such as DNA, receptors, enzymes, substrates, and
the like.

Recombinant Host Cells and Vectors

Prokaryotic hosts are generally preferred for
expression of MLL proteins. Examples of useful
prokaryotic hosts include E. coli, such as strain JM101
which is particularly useful, Bacillus subtilis,
Salmonella typhimurium, Serratia marcescens, and various
Pseudomonas species. In general, plasmid vectors
containing replicon and control sequences which are
derived from species compatible with the host cell should
be used in connection with these hosts. Such vectors
ordinarily carry a replication site and a compatible
promoter as well as marking sequences which are capable
of providing phenotypic selection in transformed cells,
such as genes for ampicillin or tetracycline resistance.
Those promoters most commonly used in recombinant DNA
construction include the β-lactamase (penicillinase) and
lactose promoter systems and the tryptophan (trp)
promoter system.

In addition to prokaryotes, eukaryotic microbes,
such as yeast cultures, may also be used. Saccharomyces
cerevisiae (common baker's yeast) is the most commonly
used among eukaryotic microorganisms, although a number
of other strains are commonly available. For expression
in Saccharomyces, the plasmid YRp7, containing the trp1
gene, is commonly used. Suitable promoting sequences in
yeast vectors include the promoters for
3-phosphoglycerate kinase or other glycolytic enzymes
such as enolase, glyceraldehyde-3-phosphate
dehydrogenase, hexokinase, pyruvate decarboxylase,
phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate
isomerase, phosphoglucose isomerase, and glucokinase. In
constructing suitable expression plasmids, the
termination sequences associated with these genes are
also ligated into the expression vector 3' of the
sequence desired to be expressed to provide
polyadenylation of the mRNA and termination.

Other promoters, which have the additional advantage
of transcription controlled by growth conditions, are the
promoter regions for alcohol dehydrogenase 2,
isocytochrome C, acid phosphatase, degradative enzymes
associated with nitrogen metabolism, the
aforementioned glyceraldehyde-3-phosphate dehydrogenase,
and enzymes responsible for maltose and galactose
utilization. Any plasmid vector containing a
yeast-compatible promoter, an origin of replication, and
termination sequences is suitable.

In addition to microorganisms, cultures of cells
derived from multicellular (eukaryotic) organisms may
also be used as hosts. In principle, any such cell
culture is workable, whether from vertebrate or
invertebrate culture. However, interest has been
greatest in vertebrate cells, and propagation of
vertebrate cells in culture (tissue culture) has become a
routine procedure in recent years. Examples of such
useful host cell lines are VERO and HeLa cells, Chinese
hamster ovary (CHO) cell lines, and WI38, BHK, COS-7, 293
and MDCK cell lines. Expression vectors for such cells
ordinarily include (if necessary) an origin of
replication, a promoter located in front of the gene to
be expressed, along with any necessary ribosome binding
sites, RNA splice sites, polyadenylation site, and
transcriptional terminator sequences.

For use in mammalian cells, the control functions on
the expression vectors are often provided by viral
material. For example, commonly used promoters are
derived from polyoma, Adenovirus 2, and most frequently
Simian Virus 40 (SV40). The early and late promoters of
SV40 virus are particularly useful because both are
obtained easily from the virus as a fragment which also
contains the SV40 viral origin of replication. Smaller
or larger SV40 fragments may also be used, as may
adenoviral vectors which are known to be particularly
useful recombinant tools.

The origin of replication may be provided either by
construction of the vector to include an exogenous
origin, such as may be derived from SV40 or other viral
(e.g., Polyoma, Adeno, VSV, BPV) source, or may be
provided by the host cell chromosomal replication
mechanism. If the vector is integrated into the host
cell chromosome, the latter is often sufficient.

Biological Functional Equivalents

As is known in the art, modifications and changes may
be made in protein structure while still obtaining a molecule
having like or otherwise desirable characteristics. For
example, certain amino acids may be substituted for other
amino acids in a protein structure without appreciable
loss of interactive binding capacity with structures such
as, for example, DNA, enzymes and substrate molecules.
Since it is the interactive capacity and nature of a
protein that defines that protein's biological functional
activity, certain amino acid sequence substitutions can
be made in a protein sequence (or, of course, its
underlying DNA coding sequence) and nevertheless obtain a
protein with like or even countervailing properties
(e.g., antagonistic v. agonistic). The present invention
thus encompasses MLL proteins and peptides including
certain sequence changes.

In making conservative changes, the hydropathic
index of amino acids may be considered. The importance
of the hydropathic amino acid index in conferring
interactive biologic function on a protein is generally
understood in the art (Kyte & Doolittle, 1982) and it is
known that certain amino acids may be substituted for
other amino acids having a similar hydropathic index or
score and still result in a protein with similar
biological activity. Each amino acid has been assigned a
hydropathic index on the basis of its hydrophobicity
and charge characteristics; these are: isoleucine (+4.5);
valine (+4.2); leucine (+3.8); phenylalanine (+2.8);
cysteine/cystine (+2.5); methionine (+1.9); alanine
(+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8);
tryptophan (-0.9); tyrosine (-1.3); proline (-1.6);
histidine (-3.2); glutamate (-3.5); glutamine (-3.5);
aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and
arginine (-4.5). In making changes, the substitution of
amino acids whose hydropathic indices are within ±2 is
preferred, those which are within ±1 are particularly
preferred, and those within ±0.5 are even more
particularly preferred.
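
The tiers of preference set out above can be illustrated with a short Python sketch. It is offered only as an illustration under stated assumptions: the function name `substitution_preference` and the use of one-letter residue codes are introduced here and form no part of the original disclosure; the index values are those listed above (Kyte & Doolittle, 1982).

```python
# Hydropathic index values as listed in the text (Kyte & Doolittle, 1982),
# keyed by one-letter amino acid code (an illustrative convention).
HYDROPATHIC_INDEX = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def substitution_preference(original, replacement):
    """Classify a substitution by the difference in hydropathic index.

    Hypothetical helper: returns the tier of preference named in the text.
    """
    # Round to one decimal so that floating-point noise does not push a
    # difference such as 2.0 just over a tier boundary.
    diff = round(abs(HYDROPATHIC_INDEX[original]
                     - HYDROPATHIC_INDEX[replacement]), 1)
    if diff <= 0.5:
        return "even more particularly preferred"
    if diff <= 1.0:
        return "particularly preferred"
    if diff <= 2.0:
        return "preferred"
    return "outside the preferred ranges"


if __name__ == "__main__":
    for pair in (("I", "V"), ("K", "R"), ("L", "A"), ("I", "R")):
        print(pair, substitution_preference(*pair))
```

For instance, isoleucine and valine differ by 0.3 and fall within the tightest tier, whereas isoleucine and arginine differ by 9.0 and fall outside all three.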

Substitution of like amino acids can also be made on
the basis of hydrophilicity, particularly where the
biological functional equivalent protein or peptide
thereby created is intended for use in immunological
embodiments. U.S. Patent 4,554,101, incorporated herein
by reference, states that the greatest local average
hydrophilicity of a protein, as governed by the
hydrophilicity of its adjacent amino acids, correlates
with its immunogenicity and antigenicity, i.e. with a
biological property of the protein.

As detailed in U.S. Patent 4,554,101, the following
hydrophilicity values have been assigned to amino acid
residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0
± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine
(+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5);
cysteine (-1.0); methionine (-1.3); valine (-1.5);
leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4). It is
understood that an amino acid can be substituted for
another having a similar hydrophilicity value and still
obtain a biologically equivalent, and in particular, an
immunologically equivalent protein. In such changes, the
substitution of amino acids whose hydrophilicity values
are within ±2 is preferred, those which are within ±1 are
particularly preferred, and those within ±0.5 are even
more particularly preferred.
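
By way of illustration only, the hydrophilicity criterion may be sketched in Python as follows. The function name `candidate_substitutions`, the one-letter residue codes, and the decision to use the nominal value for the residues listed with "± 1" (aspartate, glutamate, proline) are assumptions made for this sketch; the values themselves are those recited above.

```python
# Hydrophilicity values as recited in the text (U.S. Patent 4,554,101).
# Residues listed with "+/- 1" are entered at their nominal value, which
# is an assumption of this sketch.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def candidate_substitutions(residue, tolerance=0.5):
    """List residues whose hydrophilicity lies within `tolerance` of `residue`.

    Hypothetical helper; tolerance values of 2, 1 and 0.5 correspond to the
    preferred, particularly preferred and even more particularly preferred
    ranges named in the text.
    """
    target = HYDROPHILICITY[residue]
    return sorted(
        aa for aa, value in HYDROPHILICITY.items()
        if aa != residue and round(abs(value - target), 1) <= tolerance
    )


if __name__ == "__main__":
    print(candidate_substitutions("R"))
    print(candidate_substitutions("L"))
```

Note that this criterion considers hydrophilicity alone: arginine, lysine, aspartate and glutamate share the same nominal value although they differ in charge, so other side-chain properties would ordinarily be weighed as well.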

As outlined above, amino acid substitutions are
generally therefore based on the relative similarity of
the amino acid side-chain substituents, for example,
their hydrophobicity, hydrophilicity, charge, size, and
the like. Exemplary substitutions which take various of
the foregoing characteristics into consideration are well
known to those of skill in the art and include: arginine
and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine
and isoleucine.



While discussion has focused on functionally
equivalent polypeptides arising from amino acid changes,
it will be appreciated that these changes may be effected
by alteration of the encoding DNA, taking into
consideration also that the genetic code is degenerate
and that two or more codons may code for the same amino
acid.
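
The degeneracy of the genetic code can be made concrete with a brief Python sketch. The sketch assumes the standard genetic code written in the conventional TCAG ordering; the names `CODON_TABLE` and `synonymous_codons` are hypothetical and are introduced here for illustration only.

```python
from itertools import product

# Standard genetic code, first/second/third base each ordered T, C, A, G.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every DNA codon that encodes the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


if __name__ == "__main__":
    # Leucine is encoded by six codons; methionine by a single codon.
    print("L", synonymous_codons("L"))
    print("M", synonymous_codons("M"))
```

A change from CTG to CTC, for example, alters the DNA sequence but leaves the encoded leucine unchanged.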

Antibody Generation

As disclosed hereinbelow (see Example IV), now that
the inventors have made possible the production of
various MLL proteins, the generation of antibodies is a
relatively straightforward matter. Antibody generation
is generally known to those of skill in the art and many
experimental animals are available for such purposes.

In addition to the polyclonal antisera described
herein, the inventors also contemplate the production of
specific monoclonal antibodies. Monoclonal antibodies
(MAbs) specific for the MLL protein of the present
invention may be prepared using conventional techniques.
Initially, an MLL-containing composition would be used
to immunize an experimental animal, such as a mouse, from
which a population of spleen or lymph cells would be
obtained. The spleen or lymph cells would then be fused
with cell lines, such as human or mouse myeloma strains,
to produce antibody-secreting hybridomas. These
hybridomas may be isolated to obtain individual clones
which can then be screened for production of antibody to
the desired MLL protein.

For fusing spleen and myeloma or plasmacytoma cells
to produce hybridomas secreting monoclonal antibodies
against MLL, any of the standard fusion protocols may be
employed, such as those described in, e.g., The Cold
Spring Harbor Manual for Hybridoma Development,
incorporated herein by reference. Hybridomas which
produce monoclonal antibodies to the selected MLL antigen
would then be identified using standard techniques, such
as ELISA and Western blot methods. Hybridoma clones can
then be cultured in liquid media and the culture
supernatants purified to provide MLL-specific monoclonal
antibodies.

Epitopic Core Sequences

The present invention also makes possible the
identification of epitopic core sequences from the MLL
protein, as based on the deduced amino acid sequence
encoded by the MLL gene. The identification of MLL
epitopes directly from the primary sequence, and their
epitopic equivalents, is a relatively straightforward
matter known to those of skill in the art. In
particular, it is contemplated that one would employ the
methods of Hopp, as taught in U.S. Patent 4,554,101,
incorporated herein by reference, which teaches both the
identification of epitopes from amino acid sequences on
the basis of hydrophilicity, and the selection of
biological functional equivalents of such sequences. The
methods described in several other papers, and software
programs based thereon, can also be used to identify
epitopic core sequences; for example, the Jameson and
Wolf computer programs and the Kyte analyses may be
employed (Jameson & Wolf, 1988; Wolf et al., 1988; Kyte &
Doolittle, 1982).
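
A hydrophilicity-based search for a candidate epitopic region may be sketched in Python as a sliding-window average, locating the stretch of greatest local average hydrophilicity. This is a minimal sketch under stated assumptions and is not a reproduction of the method of U.S. Patent 4,554,101 or of any named program: the window length of six residues, the function name `most_hydrophilic_window` and the toy sequence are all hypothetical, and the toy sequence is not an MLL sequence. The residue values are those recited in this specification, with the "± 1" residues taken at their nominal value.

```python
# Hydrophilicity values as recited in this specification (nominal values
# used for aspartate, glutamate and proline, an assumption of this sketch).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (start, segment, average) for the most hydrophilic window.

    `window` is an assumed length; ties are resolved in favour of the
    first window encountered.
    """
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_average = 0, None
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        average = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if best_average is None or average > best_average:
            best_start, best_average = start, average
    return best_start, sequence[best_start:best_start + window], best_average


if __name__ == "__main__":
    # Toy sequence for illustration only.
    print(most_hydrophilic_window("AVLIFWKDEKRSGAVL"))
```

In practice the highest-scoring window would be treated only as a starting point, to be extended into a peptide of suitable length and confirmed experimentally.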

The amino acid sequence of an "epitopic core
sequence" thus identified may be readily incorporated
into peptides, either through the application of peptide
synthesis or recombinant technology. As mentioned above,
preferred peptides for use in accordance with the present
invention will generally be on the order of 15 to 50
amino acids in length, and more preferably about 15 to
about 30 amino acids in length. It is proposed that
shorter antigenic peptides which incorporate epitopes of
the MLL protein will provide advantages in certain
circumstances, for example, in the preparation of
antibodies or in immunological detection assays.
Exemplary advantages include the ease of preparation and
purification, the relatively low cost and improved
reproducibility of production, and advantageous
biodistribution.

The MLL Gene

The present inventors recently identified a yeast
artificial chromosome (YAC) that contains the breakpoint
region in leukemias with the most common reciprocal
translocations involving this chromosomal band, namely
t(4;11), t(6;11), t(9;11), and t(11;19) (Rowley et al.,
1990). They identified a gene termed MLL, for mixed
lineage leukemia or myeloid/lymphoid leukemia, that spans
the breakpoint on 11q23 (Ziemin-van Der Poel et al.,
1991). This same gene is also referred to as ALL-1
(Cimino et al., 1991; Gu et al., 1992a;b), Htrx (Djabali
et al., 1992) and HRX (Tkachuk et al., 1992) by other
workers in the field, although MLL is the accepted
designation for this gene adopted by the human genome
nomenclature committee (Chromosome Co-ordinating Meeting,
1992).

Recent data indicate that the breakpoint in a cell
line, RC-K8, with a t(11;14)(q23;q32), is approximately
110 kb telomeric to the breakpoint in other 11q23
translocations which involve the MLL gene (Akao et al.,
1991b; Lu & Yunis, 1992; Radice & Tunnacliffe, 1992).
The present inventors propose that there are at least two
different regions of band q23 involved in chromosome
11q23 translocations, and distinguish these by using the
term "more centromeric" to designate MLL rearrangements,
as distinct from those involving the more telomeric
breakpoint, which has been described as the RCK locus
(Akao et al., 1991b) or the p54 gene (Lu & Yunis, 1992).

Using pulsed field gel electrophoresis analyses, the
breakpoint region in MLL was mapped to a 92 kb NotI
fragment approximately 100 kb telomeric to the CD3G gene.
Non-repetitive sequences from three genomic clones
isolated from this region detected transcripts in the
estimated 11-12.5 kb size range (normal mRNA) in normal
cells, and in the cell line RS4;11, with a t(4;11), two
highly expressed transcripts whose estimated sizes were
11.0 and 11.5 kb (rearranged mRNA) were detected
(Ziemin-van Der Poel et al., 1991). It should be noted that
the size of these transcripts has been estimated from
measurements on Northern blots. In this size range,
i.e., above about 10 kb, the resolution of agarose gels
is known to be poorer, and hence size determinations made
in this manner may be over- or under-estimates, and may be
found to vary by about 2 or 3 kb or so, as has been reported
by other groups for the MLL gene (Cimino et al., 1991;
1992).

Improved MLL Probes

Presented herein is evidence that the breakpoints in
the t(4;11), t(6;11), t(9;11), and t(11;19)
translocations are clustered within a 9 kb BamHI genomic
region of the MLL gene, which has been more precisely
defined, by sequencing, as being 8.3 kb in length. Using
a 0.7 kb BamHI cDNA fragment of the MLL gene called MLL
0.7B (seq id no:1), rearrangements on Southern analyses
of DNA from cell lines and patient material with an 11q23
translocation were detected in this region. Probe MLL
0.7B (seq id no:1) is derived from a cDNA clone that
lacks Exon 8 sequences, but this clearly has no adverse
effects on breakpoint detection using this probe, which
is still the most advantageous probe identified to date.

Northern blotting analyses of the MLL gene are also
presented herein. These results demonstrate that the MLL
gene has multiple transcripts, some of which appear to be
lineage specific. In normal pre-B cells, four normal
mRNA transcripts estimated to be of about 12.5, 12.0,
11.5 and 2.0 kb in size are detected. These transcripts
are also present in monocytoid cell lines with additional
hybridization to an estimated 5.0 kb normal mRNA
transcript, indicating that expression of different sized
MLL transcripts may be associated with normal
hematopoietic lineage development.

In a cell line with a t(4;11), the expression of the
large 12.5, 12.0 and 11.5 kb transcripts is reduced, and
there is evidence of three other altered mRNA transcripts
estimated to be of 11.5, 11.25 and 11.0 kb. In the
Karpas 45 cell line (K45), with a t(X;11)(q13;q23)
translocation, aberrant mRNA transcripts with estimated
sizes of about 8 kb and about 6 kb were detected. These
translocations result in rearrangements of the MLL gene
and may lead to altered function(s) of the MLL gene as
well as that of other gene(s) involved in the
translocation.

In further studies, unique sequences from the 0.7
kilobase BamHI fragment, corresponding to the centromeric
and telomeric ends of the 8.3 kilobase germline fragment,
were amplified by the polymerase chain reaction (PCR) and
were used as probes to distinguish the chromosomal origin
of rearranged bands on Southern blot analysis. Patient
samples were selected on the basis of a karyotype
containing an 11q23 abnormality and the availability of
cryopreserved bone marrow or peripheral blood. Sixty-one
patients with acute leukemia and 11q23 aberrations, three
cell lines derived from such patients, and 20 patients
with non-Hodgkin's lymphomas were analyzed.

It was found that the 0.7 kilobase cDNA fragment
(seq id no:1) detected DNA rearrangements with a single
BamHI digest in 58 leukemia patients and three cell lines
with 11q23 abnormalities. This includes all cases (46
patients and two cell lines) with the common 11q23
translocations involving chromosomes 4, 6, 9, and 19. In
addition, rearrangements were identified in 16 other
cases with 11q23 anomalies, including translocations,
insertions, and inversions. Rearrangements were not
detected in three patients with leukemia and uncommon
11q23 translocations. Three of the 20 patients with
lymphoma also had rearrangements. All of these breaks
were first shown to occur within a 9 kilobase breakpoint
cluster region, later identified as a region only
8.3 kb in length. Nineteen different
chromosome breakpoints were associated with the MLL gene
in these rearrangements, suggesting that MLL is
juxtaposed to 19 different genes. In 70% of these cases,
two rearranged bands, corresponding to the two derivative
chromosomes, were detected and in 30%, only one
rearranged band was present. In cases with only one
rearranged band, it was always detected by only the
centromeric probe. Thus, the sequences centromeric to
the breakpoint are always preserved, whereas telomeric
sequences are deleted in 30% of cases.

It can be clearly seen that the 0.7 kilobase cDNA
probe of the present invention detects rearrangements on
Southern blot analysis with a single BamHI restriction
digest in all patients with the common 11q23
translocations. The same breakpoint occurs in at least
14 other 11q23 anomalies. The breaks were all found to
occur in a 9 kilobase breakpoint cluster region within
the MLL gene later shown, by sequencing, to be an 8.3 kb
region. The present inventors have, therefore, developed
specific probes that can distinguish between the two
derivative chromosomes. In cases with only one
rearranged band, the exon sequences immediately distal to
the breakpoint are deleted. This cDNA probe will be very
useful clinically both in diagnosis of rearrangements of
the MLL gene as well as in monitoring patients during the
course of their disease.

The following examples are included to demonstrate
preferred embodiments of the invention. It should be
appreciated by those of skill in the art that the
techniques disclosed in the examples which follow
represent techniques discovered by the inventor to
function well in the practice of the invention, and thus
can be considered to constitute preferred modes for its
practice. However, those of skill in the art should, in
light of the present disclosure, appreciate that many
changes can be made in the specific embodiments which are
disclosed and still obtain a like or similar result
without departing from the spirit and scope of the
invention.

EXAMPLE I

Cloning of cDNAs of the MLL Gene that Detect DNA
Rearrangements and Altered RNA Transcripts in
Human Leukemic Cells with 11q23 Translocations

1. Materials and Methods

CELL LINES AND PATIENT MATERIAL. The
characterization of the cell lines RS4;11, RCH-ADD (an
EBV transformed cell line with a normal karyotype from a
patient with leukemia and a t(1;19)), SUP-T13, U937 and
RC-K8 has been described (Stong & Kersey, 1985; Jack et
al., 1986; Smith et al., 1989; Kubonoshi et al., 1986;
Sundstrom & Nilsson, 1976). The clinical and cytogenetic
characteristics of the patient material and cell lines
with 11q23 translocations are listed in Table 1.

PREPARATION AND SCREENING OF A cDNA LIBRARY.
Poly(A)+ RNA was isolated from a monocytic cell line
(U937) using the Fast Track Isolation mRNA Kit
(Invitrogen), and a custom random primed and oligo-d(T)
primed cDNA library was made by Stratagene. A cDNA
library with a titre of 1.4 x 10^6 pfu/ml cloned into the
EcoRI site of Lambda Zap II was obtained. One half
million plaques were plated and hybridized separately
with two 32P-labelled probes, a 2.1 kb BamHI/SstI fragment
from the telomeric end of genomic clone 14 (Ziemin-van
Der Poel et al., 1991) referred to as 14BS and a 0.8 kb
PstI fragment from the centromeric end, 14P (Fig. 1).
Labeling and hybridization protocols were as previously
described (Shima et al., 1986). Positive clones were
purified and subcloned into the Bluescript vector using
the in vivo plasmid excision protocol (Stratagene).
Clones were characterized by Southern blot hybridization
and were subsequently mapped and sequenced using the
Sequenase Kit (United States Biochemical).

NORTHERN AND SOUTHERN ANALYSES. DNA was extracted
from both cell lines and from patient material. Ten
micrograms of each sample was digested with restriction
enzymes, separated on agarose gels and transferred to
nylon membranes. Poly(A)+ RNA was extracted from 100 x
10^6 cells in logarithmic or stationary growth phase using
the Fast Track Isolation Kit (Invitrogen). Five
micrograms of formamide/formaldehyde denatured RNA was
electrophoresed on a 0.8% agarose gel at 40 volts/cm for
16 or 20 hours and transferred to nylon membranes.
Hybridization and labeling protocols were as described
previously (Shima et al., 1986).

2. Results

cDNA Clones

Using a non-repetitive sequence called 14BS (2.1 kb)
(Fig. 1) from the telomeric end of genomic clone 14
(Ziemin-van Der Poel et al., 1991), the present inventors
detected two cDNA clones, 14-7 (1.3 kb) and 14-9 (1.4 kb).
Mapping and sequencing of these two clones revealed
approximately 0.5 kb of homology, and clone 14-9
contained a long stretch of Alu repeats. Clone 14-7 had
an open reading frame (ORF) that extended for the entire
insert length with a predicted direction of transcription
of MLL from centromere to telomere. Using a unique
centromeric fragment, 14P (0.8 kb), of clone 14, three
additional cDNA clones were obtained; namely 14P-18A
(1.1 kb), 14P-18B (4.1 kb) and 14P-18C (2.0 kb). The
relationship of all these clones is clearly set forth in
Fig. 1. The organization of the genomic segment is shown
in Fig. 9 and the entire 8.3 kb genomic region is
represented by seq id no:6. cDNA clone 14P-18B (seq id
no:4) differs from the published sequence of Tkachuk et
al. (1992) in that clone 14P-18B lacks exon 8 sequences.



Sequence analyses indicated that the cDNA clone
14P-18A is completely contained in 14P-18B, while the region
of homology of 14P-18B with 14P-18C is only 0.2 kb. As
is the case with clone 14-9, 14P-18C also contains
stretches of Alu repeats. All of the cDNA clones were
hybridized to Southern blots with genomic DNA digested
with a range of restriction enzymes and Fig. 1 shows the
alignment of the BamHI sites in the cDNA clones to
approximately 50 kb of genomic sequence. The genomic
BamHI sites are the same as those reported by Cimino et
al. (1992) for this same gene, which they term ALL-1. The
SalI and SstI sites in the cDNA clones and the genomic
sequence were related by hybridization to Southern blots
of the BamHI 14 kb genomic fragment. Aligning clone
14-7 with clone 14P-18B indicates that this is an almost
continuous cDNA sequence of 5.4 kb of the MLL gene.

Southern Analyses

Southern blots of DNA from control samples, cell
lines and patient material with 11q23 translocations were
hybridized to an internal 0.7 kb BamHI fragment of
14P-18B termed MLL 0.7B, and subsequently referred to as 0.7B
(Fig. 2). This probe detects a 9 kb BamHI germ line
band, and also detects DNA rearrangements in samples with
a t(4;11), t(6;11), t(9;11), and t(11;19) tested to date
(Fig. 3 and Example II). In most of the samples tested,
this probe detected two rearranged bands indicating
hybridization to both derivative chromosomes. In the
cell line SUP-T13, which has a t(11;19), this 0.7B probe
hybridized very weakly to at least two rearranged bands,
suggesting a deletion which includes DNA sequences
homologous to the probe (Fig. 3, lane 6). In the RC-K8
cell line with a t(11;14) (Fig. 3, lane 8), no
rearrangement was detected.

Northern Analyses

To determine the nature of the transcripts detected
by the cloned cDNAs, sequential hybridizations to the
same Northern blots were performed. The cDNA clones used
were 14-7, and three adjacent fragments of the cDNA clone
14P-18B, namely a 0.3 kb BamHI/EcoRI fragment termed MLL
0.3BE (0.3BE), a 0.7 kb BamHI fragment (MLL 0.7B, or
0.7B), and a 1.5 kb EcoRI/BamHI fragment termed MLL 1.5EB
or 1.5EB (Fig. 2). These fragments are cDNAs that are
telomeric to, span, and are centromeric to the breakpoint
junction, respectively. It should be noted that the
EcoRI site used to excise the 1.5 kb fragment was a
cloning site.

The most telomeric cDNA clone, 14-7, detected two
large transcripts of 12.0 and 11.5 kb in normal cell
lines (EBV immortalized B cells) and in the cell line
RC-K8 (Fig. 4A panel a). However, in the RS4;11 cell line
three transcripts of estimated sizes 12.0, 11.5 and
11.0 kb were evident (Fig. 4B panel a). There was only
weak hybridization to the normal 12.0 and 11.0 kb message
in the latter sample, while the 11.5 kb transcript was
expressed in high abundance (Fig. 4A, where actin is used
as a control probe). The ratio of expression of the 11.5
and 11.0 kb transcripts in the RS4;11 cell line was
dependent upon the state of cell growth when RNA was
extracted (compare Figs. 4A panel a, and 4B panel a).

On separate hybridizations with all three of these
fragments (0.3BE, 0.7B and 1.5EB) of clone 14P-18B, the
estimated 12.0 and 11.5 kb transcripts were detected in
normal cell lines (Fig. 4A, panel a-c). The 0.3BE probe
also detected a normal 2.0 kb transcript which was
expressed in all cell lines tested so far. In monocytoid
cell lines the 0.3BE probe detected an additional
transcript of 5.0 kb. In addition to hybridization to
the estimated 12.0 and 11.5 kb transcripts in normal cell
lines, the most centromeric 1.5EB probe detected the
large 12.5 kb transcript, which the present inventors
have described as a MLL transcript that spans the
breakpoint (Ziemin-van Der Poel et al., 1991).

It is important to stress that the size
determination of larger sized nucleic acids using
Northern blotting is not always completely accurate. In
the size range of about 9-10 kb, and above, it is known
that the poorer resolution of agarose gels can lead to
the over- or under-estimation of transcript size. Such
determinations may even differ by up to about 2 kb or so.
Therefore, it will be understood that all references to
size determinations in the results and discussions which
follow are the currently best available estimate of the
transcript size, and may not precisely correlate with the
size determined by other means, such as, for example, by
direct sequencing.

In the RS4;11 cell line, there was evidence of
differential hybridization of these probes to
transcripts. Figure 4B shows a Northern blot with RNA
from the RS4;11 cell line electrophoresed for 20 hours to
obtain better resolution of the large size transcripts.
The 0.3BE probe hybridized very strongly to the
over-expressed rearranged 11.5 kb and the 11.0 kb transcripts
with weak hybridization to a transcript of 12.0 kb.
There was also hybridization to the two smaller normal
transcripts of 5.0 and 2.0 kb (Fig. 4B panel b). The
adjacent 0.7B probe, which detected DNA rearrangements in
cells with 11q23 translocations, hybridized to the
over-expressed 11.5 kb and 11.0 kb rearranged transcripts with
weak hybridization to the normal 12.0 kb transcript, as
above. However, this 0.7B probe also detected a
rearranged mRNA transcript estimated to be 11.25 kb (Fig.
4B panel c) in these cells with a t(4;11). Finally, the
1.5EB probe, which is centromeric to the breakpoint
junction, also detected this rearranged 11.25 kb
transcript with weak hybridization to the normal 12.5,
12.0 and 11.5 kb transcripts (Fig. 4B panel d). As a
notable exception, this 1.5EB probe did not detect the
over-expressed 11.5 kb transcript and the 11.0 kb
transcript in the RS4;11 cell line. The detection of
different mRNA transcripts by these probes is summarized
in Table 2, and also represented graphically in Figure 5.

3. Discussion

The inventors have isolated several cDNA clones of
the MLL gene of which the internal 0.7 kb BamHI fragment
of cDNA clone 14P-18B (0.7B) detected rearrangements in
leukemic samples with the centromeric 11q23 translocation
(Fig. 3 and Example II). The data presented herein
indicate that the breakpoints in band 11q23 in the common
translocations which involve chromosomes 4, 6, 9 and 19
are clustered within an 8.3 kb region of the MLL gene.
In many of the samples, this probe detected two
rearranged bands indicating hybridization to both
derivative chromosomes. This implies that this 0.7B
fragment contains DNA sequences from both ends of the
9 kb BamHI genomic fragment; see also Example II.

DNA rearrangements were not detected in the RC-K8
cell line, which has a t(11;14)(q23;q32), which further
confirms the existence of at least two distinct
breakpoint regions in 11q23 (Rowley et al., 1990; Akao et
al., 1991b; Lu & Yunis, 1992; Radice & Tunnacliffe,
1992). One is the more centromeric region and involves
the MLL gene; whereas the other is at least 110 kb
telomeric and includes the breakpoint seen in the RC-K8
cell line (Akao et al., 1991b; Lu & Yunis, 1992; Radice &
Tunnacliffe, 1992). Furthermore, Lu and Yunis have
determined that the 5' non-coding region of the p54 gene
is split in this more telomeric 11q23 translocation,
which indicates that the p54 gene is different from MLL.

Figure 1 shows the alignment of the cDNAs to genomic
sequences which span approximately 50 kb. The largest
cDNA, 14P-18B, is 4.1 kb, and it is located centromeric to
clone 14-7 to give 5.4 kb of almost continuous cDNA
sequence. The inventors have therefore cloned more than
one third of the 11.0, 11.5, 12.0 and 12.5 kb transcripts
of the MLL gene. Two other cDNAs, 14P-18C and 14-9,
contain Alu repetitive sequences and share limited
homology with 14P-18B and 14-7 respectively (Fig. 1).
This indicates that these cDNAs are derived either from
different transcripts or are derived from incompletely
processed transcripts. It is now known that virtually
all 12.5 to 15.0 kb of the MLL gene is an open reading
frame and that there is homology between MLL and the zinc
finger region of the Drosophila trithorax gene (Tkachuk
et al., 1992; Gu et al., 1992a).

10 

Use of fragments of the cDNA clones in Northern
hybridizations provided evidence of a range of MLL
transcript sizes in different hematopoietic lineages as
well as of alternative exon splicing of the MLL gene
transcripts. The normal transcripts, estimated to be
2.0, 11.5, 12.0 and 12.5 kb in length, are expressed in
both hematopoietic and non-hematopoietic tissues. The
5.0 kb transcript is detected in monocytic cell lines and
in the T-cell line tested. The level of expression of
the 5.0 kb transcript in the RS4;11 cell line is
approximately 50% of that expressed in the monocytic cell
lines. This result may reflect the biphenotypic nature
of this cell line, which has both pre-B-cell and
monocytoid features.

Northern blot analyses using the 14-7 probe (which
is telomeric to the breakpoint region) detected the two
large transcripts of 12.0 and 11.5 kb in control B cells
and in the RC-K8 cell line. In the RS4;11 cell line,
this probe detected a weak signal at 12.0 kb with strong
hybridization to an 11.5 kb transcript. This probe also
detected an additional smaller transcript of 11.0 kb in
the RS4;11 cell line (Fig. 4B, panel a). The 12.0 and
11.0 kb transcripts appear to be in low abundance while
the 11.5 kb transcript is over-expressed. The relative
ratio of hybridization of the estimated 11.5 and 11.0 kb
rearranged mRNA transcripts varies with the growth phase
of the RS4;11 cells prior to RNA extraction. In
logarithmic growth phase, the ratio of the two signals is
approximately 3:1, whereas in stationary phase, the
11.0 kb transcript is hardly discernible (Figs. 4A and
4B, panel a).

To define more precisely the nature of the
transcripts detected in control cell lines and in the
cell line with the t(4;11), three adjacent fragments of
clone 14P-18B (Fig. 2) were hybridized sequentially to
the same Northern blots (Figs. 4A, 4B). All of the probes
detected the 12.0 and 11.5 kb transcripts in normal
cells. The most centromeric 1.5EB probe also detected a
12.5 kb transcript on very long exposure of
autoradiograms. These three transcripts are normal MLL
transcripts which cross the 11q23 breakpoint region. The
fact that the 1.5EB probe is the only fragment of the
4.1 kb 14P-18B cDNA clone that detects the large 12.5 kb
transcript indicates the existence of alternative exon
splicing. To date, the only other cDNA clones which
detect this transcript are 14-9 and 14P-18C. These cDNA
clones contain Alu repeats, which might indicate the
presence of intron sequences in incompletely processed
MLL transcripts.

On sequential hybridization of these three fragments
to Northern blots of RNA from the RS4;11 cell line, there
was evidence of weak hybridization to the normal 12.5,
12.0 and 11.5 kb transcripts, all of which cross the
breakpoint (Figs. 4A, 4B). The present inventors now have
evidence that the over-expressed 11.5 kb transcript in
the RS4;11 cell line is not the same as the normal
11.5 kb transcript. The 1.5EB probe detects the normal
11.5 kb transcript in control cells; however, there is
only a weak hybridization signal to an 11.5 kb transcript
in the RS4;11 cell line (Fig. 4A, panel c). This weak
hybridization is proposed to be detection of the normal
11.5 kb transcript, which is a different transcript from
the over-expressed 11.5 kb transcript that is detected
with all the other more telomeric probes. These data
indicate that the weakly hybridizing 11.5 kb transcript
detected by the 1.5EB probe is one of the three normal
12.5, 12.0 and 11.5 kb MLL transcripts that cross the
breakpoint. The reduced expression of all three of these
transcripts in the RS4;11 cell line may be due to
transcription from only the normal chromosome 11.
Therefore, the over-expressed 11.5 kb transcript which
was detected with the more telomeric probes is an altered
MLL transcript derived from the der(4) chromosome (Fig.
4B, panels a-c).

There was evidence of two other altered MLL
transcripts of 11.25 and 11.0 kb in the RS4;11 cell line.
The origin of these two transcripts was easier to define
as there was no hybridization to transcripts of these
sizes in RNA from normal cells. The 11.25 kb transcript
was detected with the centromeric 1.5EB probe and the
0.7B probe that contains sequences that span the
breakpoint, which suggests that it originates from the
der(11) chromosome (Fig. 4B, panels c, d). The 11.0 kb
transcript was detected with the same three probes (14-7,
0.3BE and 0.7B) as the aberrant 11.5 kb transcript and is
probably derived from the der(4) chromosome (Fig. 4B,
panels a-c) according to the scheme in Fig. 5. Thus the
inventors have developed cDNA probes for the MLL gene
which permit detection of three altered transcripts of
MLL arising from both derivative chromosomes in a cell
line with a t(4;11).
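
The reasoning used above to assign each altered transcript to a
derivative chromosome can be summarized as a simple decision rule.
The Python sketch below is illustrative only. The probe names and
the hybridization patterns are those reported in the text; the
function name, the grouping of probes into sets and the returned
labels are assumptions made for the example, and the rule is meant
only for transcripts that are absent from normal cells.

```python
# Probes flanking the breakpoint, as described in the text.
# 0.7B spans the breakpoint, so on its own it is uninformative.
CENTROMERIC = {"1.5EB"}
TELOMERIC = {"0.3BE", "14-7"}


def infer_origin(detected_by):
    """Guess which derivative chromosome an altered MLL transcript comes from.

    detected_by: set of probe names that hybridize to the transcript.
    MLL is transcribed from centromere to telomere, so the centromeric
    (5') sequences stay on der(11) and the telomeric (3') sequences
    are carried on der(4) in a t(4;11).
    """
    has_cen = bool(detected_by & CENTROMERIC)
    has_tel = bool(detected_by & TELOMERIC)
    if has_cen and has_tel:
        return "crosses breakpoint"
    if has_cen:
        return "der(11)"
    if has_tel:
        return "der(4)"
    return "undetermined"


# Hybridization patterns reported for the RS4;11 cell line.
altered = {
    "11.5 kb (over-expressed)": {"14-7", "0.3BE", "0.7B"},
    "11.25 kb": {"1.5EB", "0.7B"},
    "11.0 kb": {"14-7", "0.3BE", "0.7B"},
}
for name, probes in altered.items():
    print(name, "->", infer_origin(probes))
```

A transcript seen with both centromeric and telomeric probes is
reported as crossing the breakpoint, which is the pattern described
for the normal 12.0 and 11.5 kb transcripts.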

In recent reports by Croce and colleagues (Cimino et
al., 1991; 1992; Gu et al., 1992a), a genomic clone which
was 10 kb centromeric to the breakpoint region detected
a major transcript said to be about 12.5 kb and a minor
11.5 kb transcript, with additional hybridization to an
11.0 kb species which was only found in cell lines with a
t(4;11). This 11.0 kb transcript may be the same as the
altered 11.25 kb MLL transcript detected in the RS4;11
cell line using the 0.7B and 1.5EB cDNA probes. The
inventors propose that this transcript is from the
der(11) chromosome. The discrepancy in size between the
transcript detected in this study and that of Cimino et
al. may be due to poor resolution of transcripts of this
large size. Using the centromeric genomic probe, Cimino
et al. (1992) also reported hybridization to 0.4 and
5.0 kb transcripts in a variety of cell lines which were
not found in the present study.

In summary, the cDNA and Northern analyses indicate
that the MLL gene is a large complex gene with numerous
transcript sizes. In analyses of the transcripts in the
RS4;11 cell line, the inventors found that there is
reduced expression of the normal MLL transcripts of 12.5,
12.0 and 11.5 kb, and that (Heim & Mitelman, 1987) the
over-expressed 11.5 kb transcript and the 11.0 kb
transcript as well as the 11.25 kb transcript specific to
the RS4;11 cell line are altered MLL transcripts arising
from the translocation derivative 4 and derivative 11
chromosomes, respectively. How, or if, these three
altered transcripts of the MLL gene alter normal MLL
protein expression and function and contribute to
leukemogenesis is still unknown.

A major question in reciprocal translocations is
which derivative chromosome contains the critical
junction. Analysis of complex translocations indicates
that, for these 11q23 translocations, it is the der(11)
chromosome. The Southern blot analysis of patient data,
as presented in Example II, supports this interpretation.
Because the direction of transcription of MLL is from
centromere to telomere, the juxtaposition of the 5'
sequences and the 5' flanking regulatory regions of MLL
remaining on the der(11) to various other genes on other
chromosomes may play an important role in all of these
leukemias. The fact that this translocation is
associated with lymphoid and myeloid leukemias suggests
that the regulated expression of the MLL gene may be
important in normal hematopoietic lineage specificity,
and that rearrangements of this gene play a critical role
in the oncogenic process of these leukemias.

EXAMPLE II 

A cDNA Probe Detects All Rearrangements of the MLL Gene 
in Leukemias with Common and Rare 11q23 Translocations

This example concerns the identification of a
restriction fragment from a cDNA clone which detects
rearrangements in all cases of the t(4;11), t(6;11),
t(9;11), and both types of t(11;19) examined, as well as
in many rare translocations with a breakpoint at band
11q23. A key feature of this fragment is that it
contains exons that flank the breakpoints in all of these
cases. The present inventors have thus delineated an
8.3 kilobase breakpoint cluster region in the common and
rare translocations involving 11q23. In addition,
through the use of probes amplified by the polymerase
chain reaction (PCR) from the centromeric and telomeric
portions of this cDNA fragment, the present invention
provides methods and compositions for use in
distinguishing between the two derivative chromosomes.
Moreover, this example provides further data to support
the hypothesis that the derivative 11 chromosome contains
the critical translocation junction.

1. Materials and Methods

PATIENTS AND CELL LINES. Patient samples were
obtained from the University of Chicago Medical Center,
Saitama Cancer Center, Southwest Biomedical Research
Institute, and Memorial Sloan-Kettering Cancer Center.
The samples were selected on the basis of a karyotype
containing an 11q23 abnormality and the availability of
cryopreserved leukemic bone marrow or peripheral blood.
The cell line RS4;11 was a gift from J. Kersey at the
University of Minnesota (Stong & Kersey, 1985); SUP-T13
was a gift from S. Smith at the University of Chicago
(Smith et al., 1989); and Karpas 45 was a gift from
A. Karpas at Cambridge University (Karpas et al., 1977).

CYTOGENETIC ANALYSIS. Cytogenetic analysis was
performed using a trypsin-Giemsa banding technique.
Chromosomal abnormalities were described according to the
International System for Human Cytogenetic Nomenclature
(Harnden & Klinger, 1985).

cDNA LIBRARY. A cDNA library was prepared from a
monocytic cell line as described above in Example I. The
library was screened with probes from the centromeric and
telomeric ends of a 14 kilobase genomic BamHI fragment
(clone 14), and several cDNA clones were obtained and
mapped with restriction endonucleases. A 0.7 kilobase
fragment called MLL 0.7B was isolated from a cDNA clone
named 14P18C and used as described below.

MOLECULAR ANALYSIS. DNA was extracted from
cryopreserved cells, digested with restriction
enzymes, electrophoresed on 0.7% agarose gels,
transferred to nylon membranes, and hybridized with
radiolabeled cDNA probes at 42°C. All DNA blots were
washed to a final stringency of 1X SSC and 1% SDS at 65°C
prior to autoradiography.

SEQUENCE ANALYSIS. Nucleotide sequences were
obtained by the dideoxy chain termination method with a
double-stranded DNA sequencing strategy using the
Sequenase kit (United States Biochemical, Cleveland, OH).

POLYMERASE CHAIN REACTION (PCR). Amplification of
unique sequences from the 0.7 kilobase BamHI fragment,
corresponding to exons at the centromeric and telomeric
ends of the 9 kilobase germline fragment, was performed
using standard methods. 10 ng of cDNA were amplified in
50 µl of reaction mix containing 1.5 mM MgCl2, 1.25 mM
dNTPs, and 2.5 U of Taq polymerase. Reactions were
performed in an automated thermal cycler
(Perkin-Elmer/Cetus, Norwalk, CT) with denaturation at 92°C for
50 seconds, annealing at 50°C for 50 seconds, and
extension at 72°C for one minute.

2. Results

The inventors isolated a 0.7 kilobase BamHI cDNA
fragment which is composed of exons flanking the
centromeric and telomeric ends of an 8.3 kilobase genomic
BamHI fragment of the MLL gene (Example I, Figs. 1 and
2). On Southern blot analysis, this 0.7 kilobase cDNA
fragment, 0.7B, detected rearrangements of the MLL gene
in 61 patients (58 with leukemia and three with lymphoma)
and three cell lines (Fig. 6). This included all 48
cases (46 patients and two cell lines) with the common
translocations involving 11q23, including the
t(4;11)(q21;q23), t(6;11)(q27;q23), t(9;11)(p22;q23),
t(11;19)(q23;p13.1) and t(11;19)(q23;p13.3) (Table 3).



WO 93/25713 PCT/US93/05857

[TABLE 3: table body not recoverable from this copy]

Also identified by the 0.7B probe were similar MLL
gene rearrangements in DNA from 8 patients and one cell
line with several less common 11q23 translocations listed
in Human Genome Mapping 11 (Table 3) (Mitelman et al.,
1991). These include translocations involving 1p32,
1q21, 2p21, 17q21, 17q25, Xq13, and three cases with
insertion 10;11. In addition, 7 other 11q23 anomalies
which have not been reported as recurring abnormalities,
including translocations involving 6p12, 10p11, 10q22,
15q15, 18q21, and 22q12, and one case with
inv(11)(q14q23), showed MLL rearrangements (Table 4).
The rearrangements detected in cell lines included RS4;11
with a t(4;11), SUP-T13 with a t(11;19), and Karpas 45
with a t(X;11)(q13;q23).

The 0.7B MLL probe did not detect rearrangements in
remission samples from patients who had rearrangements in
the DNA from their leukemia cells. In addition,
rearrangements were not identified in a few cases with
uncommon 11q23 translocations. These included AML
patients with a t(4;11)(q23;q23) and a t(5;11)(q13;q23),
and an ALL patient with a t(10;11)(p13;q23). However, and
importantly, no patients were identified with the common
11q23 translocations who failed to show rearrangements
with the 0.7 kilobase cDNA fragment termed 0.7B.

The age distribution of the leukemia patients in
this series was broad; 11 patients were one year or less,
16 were between the ages of two and 16, and 31 were 17
years or older. There were 27 females and 31 males. The
phenotype of the leukemias in these patients showed 28
with ALL and 30 with AML. The cases with ALL and AML
were indistinguishable by Southern blot analysis. In 70%
of cases, two rearranged bands, corresponding to the two
derivative chromosomes, were detected. Only a single
rearranged band was detected in the remaining 30% of
cases (Fig. 7). To determine whether there were any
potential correlations with the presence of one versus
two rearranged bands, the patients were analyzed by
karyotypic abnormalities, phenotype of the leukemic
cells, and by age. No significant associations between
the number of rearranged bands and any of these subgroups
were found.

In addition to these acute lymphoid and myeloid
leukemias, 20 cases of non-Hodgkin's lymphomas were also
examined. Rearrangements were detected in three of these
patients. These included one patient with a follicular
small cleaved-cell lymphoma who had a karyotype which
showed both a t(14;18)(q32;q21) and a t(6;11)(p12;q23), a
patient with Burkitt's lymphoma whose karyotype included
a t(8;14)(q24;q32) and an inv(11)(q14q23), and a patient
with a diffuse mixed small cleaved cell and large cell
lymphoma whose karyotype also included a trisomy 21. The
other 17 lymphomas with 11q23 abnormalities, primarily
deletions and duplications, did not show rearrangements.

To distinguish which derivative chromosome is
represented by each of the rearranged bands on Southern
blot analysis, sequences from the centromeric and
telomeric portions of the 0.7 kilobase cDNA fragment,
0.7B, were amplified by PCR to create distinct DNA
probes. The centromeric PCR fragment detected the
germline band and only one of the rearranged bands on
Southern blot analysis. Thus, the rearranged band
detected with this probe corresponds to the derivative 11
[der(11)] chromosome. The fragment amplified by PCR from
the portion of the 0.7 kilobase cDNA fragment telomeric
to the breakpoint was also hybridized to the same blots.
The telomeric probe identified the germline band as well
as the derivative chromosome of the other translocation
partner. Clearly, in cases with two rearranged bands,
both derivative chromosomes are present. However, in the
cases in which only one rearranged band is detected, it
consistently is identified only by the centromeric probe.
Therefore, the sequences immediately centromeric to the
breakpoint are always preserved, but the sequences distal
to the breakpoint appear to be deleted in 30% of cases.
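
The band assignment described above can be written out as a short
lookup. The following Python sketch is a hypothetical illustration,
not a procedure taken from the study: the function names, the probe
labels "centromeric" and "telomeric", and the output labels are all
assumptions chosen for the example. It encodes only the rule stated
in the text, namely that a band seen with the centromeric PCR probe
alone is the der(11) and a band seen with the telomeric PCR probe
alone is the other derivative.

```python
def assign_band(probes):
    """Label one rearranged Southern band from the PCR probes that detect it.

    probes: subset of {"centromeric", "telomeric"}.
    """
    if probes == {"centromeric"}:
        return "der(11)"
    if probes == {"telomeric"}:
        return "der(partner)"
    if probes == {"centromeric", "telomeric"}:
        # Both probes hit the same band; the simple rule cannot assign it.
        return "ambiguous"
    return "not detected"


def summarize(bands):
    """bands: list of probe sets, one per rearranged band seen with 0.7B.

    Returns the label for each band and a flag that is True when the only
    rearranged band is the der(11), the pattern the text interprets as
    deletion of the sequences distal to the breakpoint.
    """
    labels = [assign_band(b) for b in bands]
    distal_deleted = labels == ["der(11)"]
    return labels, distal_deleted


print(summarize([{"centromeric"}, {"telomeric"}]))  # two rearranged bands
print(summarize([{"centromeric"}]))                 # single rearranged band
```

The "ambiguous" label is kept separate on purpose, so that a band
detected by both PCR probes is flagged for closer study rather than
forced into one of the two categories.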




In two of the patients (both Japanese) analyzed, a
different pattern of hybridization was noted with the
three probes employed. In one patient with a t(1;11) and
another with a t(4;11), the 0.7 kilobase cDNA probe and
the centromeric PCR probe both identified the same two
rearranged bands (Fig. 8). In all other cases, the
centromeric PCR probe recognized only one of the two
rearranged bands. In these two patients, as in all other
cases, the telomeric PCR probe detected only one of the
two rearranged bands. Presumably, these breaks differed
from the remainder of cases that were examined. Clearly,
a portion of the exon sequences in these two patients,
which in all other cases remains on the der(11), is
translocated to the other derivative chromosome. The
breaks may occur either within one or more exons on the
centromeric side of the 8.3 kilobase genomic fragment or,
alternatively, if more than one exon is present, the
breaks may occur within an intron separating these exons.
Further analysis of the exon/intron boundaries within the
8.3 kilobase genomic BamHI fragment will allow the
determination of the precise localization of these
breakpoints.



3. Discussion



The present inventors have identified DNA
rearrangements in 61 patients and three cell lines with
11q23 abnormalities that affect the MLL gene and have
delineated an 8.3 kilobase breakpoint cluster region
within this gene using a 0.7 kilobase BamHI cDNA fragment
(seq id no: 1) as a probe. Rearrangements have been
detected in all 48 cases examined with the t(4;11),
t(6;11), t(9;11), and both types of t(11;19), as well as
in 12 rare translocations, three insertions, and one
inversion involving 11q23. Rearrangements were also
detected in three patients with non-Hodgkin's lymphoma.
These are the first cases of lymphoma that have been
found to share the same breakpoint as the leukemias with
11q23 translocations. While rearrangements are
detectable with multiple restriction enzymes, digestion
with only a single enzyme, BamHI, was sufficient to
identify each case with a rearrangement. In 70% of these
cases, two rearranged bands, corresponding to the two
derivative chromosomes, were identified, and in 30%, only
one band was present, which we showed was derived from the
der(11) chromosome.






The present study using the novel probes described
above, particularly the 0.7 kb BamHI fragment, gave
significantly improved results over all previously
reported studies. For example, Cimino et al. described
the identification of a 0.7 kb DdeI genomic fragment that
detected rearrangements in a 5.8 kilobase region in 6 of
7 patients with the t(4;11), 4 of 5 with t(9;11), and 3
of 4 with the t(11;19) (Cimino et al., 1991). In three
of these 16 patients, two rearranged bands were detected
and in the remainder, only one rearranged band was
identified. Subsequently, they reported on an additional
14 patients with this probe (Cimino et al., 1992). In
their combined series, this probe detected rearrangements
in 26 of 30 cases (87%) with the t(4;11), t(9;11), and
t(11;19). They hypothesize that the breaks in the 4
cases that were not identified with their probe occur
either at another site within this gene or at other loci
in 11q23. Assuming that the true incidence of
rearrangements within the breakpoint cluster region in
patients with the 5 common 11q23 translocations is 87%,
then the likelihood, calculated by binomial
probabilities, of identifying rearrangements in 48 of 48
consecutive cases is 0.0014. Thus, the failure to detect
rearrangements in those 4 cases by Cimino and colleagues
is likely due to the separation of these breaks from the
genomic DdeI probe by a DdeI restriction site.
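
The binomial argument above reduces to raising the assumed
per-case detection rate to the power of the number of cases. The
Python sketch below is a minimal illustration under that reading;
the function name is hypothetical, and the independence of cases is
an assumption of the binomial model rather than something shown in
the text. With either the rounded rate of 0.87 or the unrounded
26/30, the result is on the order of 0.001, the same order of
magnitude as the 0.0014 quoted; the exact figure depends on how the
rate is rounded.

```python
def prob_all_detected(rate, n_cases):
    """Probability that all n_cases independent cases are detected,
    i.e. the binomial probability of n successes in n trials."""
    return rate ** n_cases


# Detection rate as quoted in the text (rounded) and as the raw fraction.
p_rounded = prob_all_detected(0.87, 48)
p_fraction = prob_all_detected(26 / 30, 48)

print(f"rate 0.87  -> {p_rounded:.5f}")
print(f"rate 26/30 -> {p_fraction:.5f}")
```

Either way, detecting rearrangements in 48 of 48 cases would be
very unlikely if the true incidence within the cluster region were
only 87%, which is the point the text relies on.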

Importantly, whereas the breakpoint in many cases
with 11q23 translocations may be contained within a 5.8
kilobase genomic fragment, the breakpoint cluster region
of the present invention encompasses a larger region of
8.3 kilobases and contains the breakpoints in all
leukemia cases with the common translocations, as well as
in all except three of the rare translocations examined.

Pulsed field gel electrophoresis (PFGE) and
fluorescence in situ hybridization (FISH) both have been
used to map the region containing the 11q23 breakpoints
in leukemias (Savage et al., 1988; 1991; Yunis et al.,
1989; Tunnacliffe & McGuire, 1990). With FISH, the
breakpoint lies telomeric to the CD3G gene and
centromeric to the PBGD gene (Rowley et al., 1990). With
PFGE, the distance between the CD3G gene and the
breakpoint in the t(4;11) has been narrowed to 100-200
kilobases (Das et al., 1991). Chen et al. (1991) have
shown by PFGE that there is a clustering of breakpoints
in eight cases with the t(4;11) and in two other patient
samples with 11q23 translocations, but the size and
location of this region could not be determined
precisely.

Whereas the data presented herein and that of Cimino
et al. (1991; 1992) indicate a clustering of breakpoints,
several studies have suggested that the breakpoints on
11q23 may be heterogeneous. Using cosmid probes and
FISH, Cherif et al. (1992) found that one of their probes
was proximal to the breakpoint in the t(11;19) and distal
to those in the t(4;11), t(6;11), and t(9;11). Cotter et
al. (1991), using PCR amplification of microdissected
material from 11q23, reported that the breaks in two
t(6;11) cases were proximal to the CD3D gene and that the
breakpoints in the t(4;11) and t(9;11) were distal to
this gene.

Molecular studies have confirmed that the
breakpoints in translocations involving the antigen
receptor loci on chromosome 14 differ from the 11q23
translocations just discussed. Studies on the RC-K8 B-cell
lymphoma line, which has a t(11;14)(q23;q32), showed
that the immunoglobulin heavy chain constant region gene
and a gene called RCK were involved in the translocation
(Akao et al., 1990; 1991a). Mapping data indicate that
RCK is over 100 kilobases telomeric to MLL (Radice &
Tunnacliffe, 1992). In addition, the present inventors
cloned a t(11;14)(q23;q11) from a patient with a null-cell
ALL and identified rearrangements of the T cell
receptor alpha/delta locus. DNA probes from this 11q23
breakpoint failed to show rearrangements in leukemias
with the common 11q23 translocations. Mapping data
indicate that this breakpoint is approximately 700
kilobases telomeric to MLL. Therefore, band 11q23
contains breakpoints for at least three different
cancer-related translocations. However, the data presented
herein establish a tight clustering of breakpoints in the
MLL gene, which is centromeric to RCK and the other
t(11;14) breakpoints previously described by the
inventors.



In reciprocal translocations, the identification of
the derivative chromosome containing the critical
junction is essential. Based on data from Southern blot
analysis, FISH, and cytogenetic analysis of complex
translocations, the inventors propose that the der(11)
contains the critical junction. At the molecular level,
the Southern blot analyses show a consistent pattern that
indicates that the 5' portion of the exon sequences
centromeric to the breakpoint on the der(11) is always
conserved. In those cases in which the 0.7 kilobase cDNA
fragment identifies one rearranged band, it is always
detected by only the centromeric PCR probe. Thus, exon
sequences from the centromeric portion of the 8.3
kilobase BamHI genomic fragment are always preserved on
the der(11), but the exon sequences from the telomeric
portion of this genomic fragment can be deleted in the
formation of the translocation.

Previously, the inventors identified a patient with
a t(9;11) who was found to have a deletion by FISH of a
series of probes spanning several hundred kilobases
telomeric to the breakpoint on 11q23 (Rowley et al.,
1990). On Southern blot analysis of this patient's DNA,
only one rearranged band was identified and thus the exon
telomeric to the breakpoint was deleted. Recently, using
FISH, the present inventors also found that a phage clone
containing a large portion of the 14 kilobase genomic
BamHI fragment immediately telomeric to the 8.3 kilobase
breakpoint cluster region was also deleted in this
patient. This 14 kilobase genomic BamHI fragment
contains an open reading frame of MLL. Presumably, all
of the coding sequences distal to the breakpoint are
deleted in this patient. In addition, another patient
with a t(6;11) was also found to have one rearranged band
on Southern analysis and a deletion of this same phage
clone by FISH. Thus, in several patients, deletions begin
within the breakpoint cluster region and extend distally
to include the region containing coding sequences of the
gene.

The molecular and FISH data indicating that the
der(11) chromosome contains the critical junction are
supported by an analysis of complex translocations that
involve three chromosomes. For example, in a
t(4;11;17)(q21;q23;q11), the movement of the 4q to 11q
{the der(11)} is conserved, whereas the 11q is
translocated to the derivative 17 chromosome. An
analogous pattern has been identified in 13 cases of
complex translocations. Based on the data of the present
invention, the following model is proposed. As a result
of the translocation, sequences on the der(11) are joined
to a large number of other chromosomal breakpoint
regions, 19 detected in the inventors' laboratories
alone. Presumably, the 5' sequences of the MLL gene are
thus juxtaposed to 3' sequences from genes located on the
other translocation partners. The present invention
provides the molecular tools to allow the functional
consequences of these translocations to be determined.

The present inventors have delineated a breakpoint
cluster region in the MLL gene and have identified
rearrangements in a total of 19 different translocations,
insertions, and inversions involving 11q23. The 0.7
kilobase cDNA probe of the present invention, and its
derivative centromeric and telomeric PCR probes, are
proposed to be broadly applicable to clinical diagnosis,
particularly as they detect all of the rearrangements in
DNA digested with a single enzyme (BamHI). This is
envisioned to be useful in the rapid detection of
leukemia in both children and adults and will be
especially important in leukemic infants under one year
of age, in whom the single most common chromosomal
abnormality is a translocation involving 11q23. In
addition, it is contemplated that this probe will be
effective for monitoring response to chemotherapy and for
evaluation of minimal residual disease following
treatment. These probes will be essential in cloning the
breakpoints of leukemias which involve the MLL locus and
in further molecular analysis of these translocations.



EXAMPLE III 

Sequencing of the 8.3 kilobase Genomic BamHI Fragment
that Contains All of the Common MLL Translocation Breakpoints

The inventors have recently obtained the DNA
sequence for the 8.3 kb genomic BamHI fragment which
contains all of the common translocation breakpoints.
This sequence is provided in the present application as
seq id no: 6.

The inventors envision using this new sequence
information to map the intron-exon boundaries within this
region and to identify the specific nucleotides involved
in the breakpoint junctions in various patients.

EXAMPLE IV

Expression of MLL-Derived Proteins and Anti-MLL
Antibodies

1. Production of Antisera to a Region of MLL Telomeric
to the Breakpoint Region (MLL Amino Acids of Seq Id
No: 8)

To express MLL amino acids of seq id no:8
(corresponding to MLL amino acids 2772-3209 of Tkachuk et
al., 1992), plasmid 14-7 was digested with EcoRI and the
insert was ligated into plasmid pGEX-KG digested with
EcoRI, resulting in the 1.3 kb MLL fragment inserted in
frame into the expression vector. This construct
produces an MLL amino acid-containing fusion protein with
GST (glutathione-S-transferase). This DNA was
transformed into JM101 bacteria. To produce large
quantities of the MLL protein corresponding to seq id
no:8 for production of rabbit antisera, the
plasmid-transformed bacteria were grown in LB medium and
induced to express the fusion protein with IPTG.
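
By way of illustration only, the relationship between the
stated amino acid range and the stated insert size can be
checked with simple arithmetic. The following is a minimal
sketch; the function name is hypothetical and is not part of
the disclosure. It assumes only that each amino acid is
encoded by three nucleotides and that the quoted ranges are
inclusive.

```python
def coding_length_bp(first_aa, last_aa):
    """Length in base pairs of the sequence encoding an inclusive
    amino acid range, at three nucleotides per codon."""
    if last_aa < first_aa:
        raise ValueError("last_aa must not be smaller than first_aa")
    return (last_aa - first_aa + 1) * 3


# Amino acids 2772-3209 (numbering of Tkachuk et al., 1992), as quoted above.
telomeric_bp = coding_length_bp(2772, 3209)
print(telomeric_bp / 1000.0, "kb")
```

The result is in keeping with the 1.3 kb MLL fragment
described for the EcoRI insert of plasmid 14-7.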

This fusion protein was purified using
glutathione-agarose affinity chromatography, followed by
preparative SDS-polyacrylamide gel electrophoresis. The
fusion protein was then electroeluted from the gel and
used to immunize rabbits in order to generate specific
antisera (performed by Josman Laboratories, Napa, CA).
The rabbit antisera produced against the MLL protein
corresponding to seq id no:8 have a very high titer by
western blotting and react specifically with the MLL
portion of the fusion protein (Fig. 10).

2. Production of Antisera to a Region of MLL
Centromeric to the Breakpoint Region (MLL Amino
Acids 323-623 from Seq Id No:7)

Specific MLL oligonucleotides with SmaI restriction
enzyme sites were used as PCR primers to amplify the
sequence encoding MLL amino acids 323-623 from seq id
no:7 using the plasmid 14P-18B as template. This
amplified DNA was digested with SmaI and ligated into
plasmid pGEX-KT (an improved version of plasmid pGEX-KG
used above) that had been digested with SmaI. This
results in the sequence encoding MLL amino acids 323-623
(representing MLL amino acids 1101-1400 of Tkachuk et
al., 1992), corresponding to the proline-rich region,
being inserted in-frame into the expression vector. This
DNA was transformed into BL21 bacteria. Large amounts of
this fusion protein can be produced using this
methodology and employed in the production of specific
antisera, for example, using rabbits.



Such antibodies may be employed as part of the
ongoing studies directed to the MLL protein. For
example, they may be employed to determine the
localization of the MLL protein within the cell, or to
determine whether this protein binds to DNA. The
generation of monoclonal antibodies has also been made
possible by the present invention.

EXAMPLE V 
Expression of Various MLL Domains 

The MLL zinc finger regions (corresponding to amino
acids 1350-1700, 1700-2000, and 1350-2000 of Tkachuk et
al., 1992) have been cloned into the pGEX-KT expression
vector as described above. In addition, the inventors
propose to clone various MLL protein coding regions into
the expression vector pSg24 in pieces ranging from
300-650 amino acids to allow the functional definition of
the MLL protein.

EXAMPLE VI

Detection of MLL Gene Rearrangements in Karpas 45
Leukemic Cells with a t(X;11)(q13;q23) Translocation

This example concerns the detection and
characterization of aberrant MLL transcripts in Karpas 45
leukemic cells with a t(X;11)(q13;q23) translocation and
provides further evidence of the utility of the present
probes in detecting leukemic cells with different
breakpoints.

In this analysis of the Karpas 45 cell line (Karpas
et al., 1977), known to have a t(X;11)(q13;q23)
translocation (Kearney et al., 1992), the inventors show
the MLL gene to be rearranged and demonstrate the
presence of two altered MLL transcripts which come from
the der(11) chromosome. MLL was also found to be
rearranged using Southern blot analyses of DNA from
Karpas 45.

1. Materials and Methods

The T-cell line Karpas 45, established from a
patient with a T-cell ALL, was obtained from A. Karpas
(University of Cambridge, England; Karpas et al., 1977).
Karpas 45 has been shown, by fluorescence in situ
hybridization, to have a t(X;11)(q13;q23), which involves
rearrangement of the MLL gene. The cell lines RC-K8 and
RCH-ADD, which do not have chromosomal translocations
that involve MLL, have been described previously
(Ziemin-van Der Poel et al., 1991) and were used as
controls.

The cDNA probe 14P-18B has been described herein in
the previous examples. The cDNA clone was digested with
EcoRI and BamHI to give three fragments for use in
Northern and Southern blot hybridizations. The 0.7B
probe, which spans the breakpoint, and the 1.5EB probe,
centromeric to the breakpoint, have been described
hereinabove. A further 0.8 kb EcoRI fragment, which is
telomeric to the breakpoint, was obtained and used in this
study; this probe is termed 0.8E. It should be noted
that the EcoRI site used to excise the 1.5EB fragment was
a cloning site.

DNA was extracted from the Karpas 45 cell line and
normal human placenta, digested with the restriction
enzyme BamHI and electrophoresed on a 1% agarose gel.
Poly(A)+ RNA was isolated from the cell lines Karpas 45,
RC-K8 and RCH-ADD using the Fast Track Isolation Kit
(Invitrogen) and 5 µg were electrophoresed on a 0.8%
formaldehyde gel as described hereinabove. Radioactive
labeling of cDNA fragments, hybridization and washing
conditions were as described in the previous examples.

2. Results and Discussion

To determine if MLL was rearranged in the Karpas 45
cell line, known to have an 11q23 translocation, a
Southern blot with BamHI-digested DNA was hybridized to
the 0.7B probe. Figure 11 shows that the MLL gene was
rearranged in this 11q23 translocation and that two
rearranged fragments are evident, indicating the
detection of sequences from both derivative chromosomes
X and 11.
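
The reasoning behind the two rearranged fragments can be
sketched as follows, for illustration only. A probe that
spans the breakpoint hybridizes to the MLL sequences retained
on each derivative chromosome, so each junction yields its own
BamHI fragment. In this minimal sketch the 8.3 kb germline
fragment size is taken from the text, while the function name,
the breakpoint position and the distances to BamHI sites in
the partner chromosome are hypothetical values chosen purely
as an example.

```python
GERMLINE_BAMHI_KB = 8.3  # germline BamHI fragment size stated in the text


def southern_bands(breakpoint_kb, partner_distal_kb, partner_proximal_kb,
                   normal_allele_present=True):
    """Expected band sizes (kb, largest first) for a breakpoint-spanning probe.

    breakpoint_kb: hypothetical distance from the centromeric BamHI site
        of the germline fragment to the breakpoint.
    partner_distal_kb: hypothetical distance from the junction to the next
        BamHI site in the partner sequences joined on der(11).
    partner_proximal_kb: hypothetical distance from the nearest BamHI site
        in the partner sequences to the junction on the reciprocal
        derivative chromosome.
    """
    if not 0 < breakpoint_kb < GERMLINE_BAMHI_KB:
        raise ValueError("breakpoint must lie inside the germline fragment")
    # der(11): centromeric MLL sequences joined to partner sequences.
    der11 = breakpoint_kb + partner_distal_kb
    # Reciprocal derivative: partner sequences joined to telomeric MLL sequences.
    reciprocal = partner_proximal_kb + (GERMLINE_BAMHI_KB - breakpoint_kb)
    bands = [round(der11, 1), round(reciprocal, 1)]
    if normal_allele_present:
        bands.append(GERMLINE_BAMHI_KB)  # unrearranged chromosome 11, if retained
    return sorted(bands, reverse=True)


# Hypothetical example values only; they are not measurements from Fig. 11.
bands = southern_bands(5.0, 1.5, 7.0)
print(bands)
```

Under these assumptions, a rearrangement is recognized by
bands whose sizes differ from that of the germline fragment,
whatever the actual sizes of the junction fragments are.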

To determine the nature of the MLL transcripts in
this cell line, a Northern blot was hybridized
sequentially to three different fragments of the 14P-18B
cDNA clone. The fragments used were 0.8E (telomeric to
the breakpoint), a 0.7B fragment (which spans the
breakpoint) and finally a 1.5EB fragment (which is
centromeric to the breakpoint), as shown in Fig. 2. All
three fragments were found to show weak hybridization to
the two normal sized MLL transcripts in all the cell
lines (Fig. 12).

The 0.7B and the 1.5EB fragments detected two
additional transcripts, an abundant 8.0 kb transcript and
a diffuse band around 6.0 kb in the Karpas 45 cell line,
which were not present in the control cell lines (Fig.
12). Furthermore, these two transcripts were not
detected by the more telomeric 0.8E fragment (Fig. 12).
Hybridization to actin indicated that there was
approximately 50% less RNA in the Karpas 45 cell line
lane compared to RNA in the control cell line (Fig. 12).

It should be noted here that the two normal sized
MLL transcripts, listed as being of about 15 and 13
kilobases, are the same transcripts previously referred
to as about 12 and about 11.5 kb throughout the earlier
examples. This illustrates the fact that the studies
shown in Fig. 12 were conducted at a later date and that,
as mentioned before, the earlier Northern blot size
determinations were generally approximations, as is well
known to result from using this method to determine sizes
of greater than about 9 or 10 kb. However, this study of
the Karpas cell line further exemplifies the utility of
the probes in differentiating between normal and leukemic
cells.

The present study further supports the inventors'
findings that the breakpoint cluster region in the MLL
gene occurs within a 9.0 kilobase BamHI genomic fragment.
On Northern analysis all three of the cDNA fragments
detected the normal-sized MLL transcripts in the control
cell lines, and to a lesser extent in the Karpas 45 cell
line. However, the 0.7B and the 1.5EB fragments, which
span and are centromeric to the breakpoint junction
respectively, detected two additional altered transcripts
of the MLL gene in the Karpas 45 cell line. As the more
telomeric 0.8E fragment did not hybridize to these two
novel transcripts, it may be concluded that these
transcripts are altered MLL transcripts coming from the
derivative 11 chromosome.
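
The inference drawn above from the hybridization pattern may
be expressed, purely as an illustrative sketch, in the
following form. The probe names and their positions relative
to the breakpoint are taken from the text; the function name
and the returned labels are hypothetical. The sketch assumes,
as the text does, that MLL sequences centromeric to the
breakpoint are retained on the der(11) chromosome.

```python
# Position of each probe relative to the breakpoint, as described in the text.
PROBE_POSITION = {
    "1.5EB": "centromeric",
    "0.7B": "spanning",
    "0.8E": "telomeric",
}


def infer_origin(detected_by):
    """Infer the likely origin of a transcript from the set of probes
    that hybridize to it (hypothetical helper, for illustration)."""
    positions = {PROBE_POSITION[name] for name in detected_by}
    has_cen = "centromeric" in positions
    has_tel = "telomeric" in positions
    if has_cen and has_tel:
        return "contains sequences from both sides of the breakpoint"
    if has_cen:
        return "der(11)"
    if has_tel:
        return "reciprocal derivative"
    return "undetermined"


# The two novel Karpas 45 transcripts: detected by 0.7B and 1.5EB, not by 0.8E.
print(infer_origin({"0.7B", "1.5EB"}))
# The normal-sized transcripts: detected by all three fragments.
print(infer_origin({"0.8E", "0.7B", "1.5EB"}))
```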

Evidence of any altered MLL transcripts derived from
the reciprocal chromosome X was not found in the Karpas
45 cell line. This is in keeping with the inventors'
proposition that the derivative 11 chromosome contains
the critical junction in two and three way reciprocal
translocations involving chromosome band 11q23 and the
associated rearrangement of the MLL gene.



* * * 



While the compositions and methods of this invention
have been described in terms of preferred embodiments, it
will be apparent to those of skill in the art that
variations may be applied to the composition, methods and
in the steps or in the sequence of steps of the method
described herein without departing from the concept,
spirit and scope of the invention. More specifically, it
will be apparent that certain agents which are both
chemically and physiologically related may be substituted
for the agents described herein while the same or similar
results would be achieved. All such similar substitutes
and modifications apparent to those skilled in the art
are deemed to be within the spirit, scope and concept of
the invention as defined by the appended claims. All
claimed matter and methods can be made and executed
without undue experimentation.

CLAIMS



1. A method for detecting leukemic cells containing
11q23 chromosome translocations, comprising:

    (a) obtaining genomic DNA from cells suspected
        of containing a leukemia-associated
        chromosomal rearrangement at chromosome
        11q23;

    (b) digesting said DNA with one or more
        restriction enzymes; and

    (c) probing said digested DNA with a nucleic
        acid probe which includes a sequence in
        accordance with the sequence of a 0.7 kb
        BamHI fragment of cDNA clone 14P-18B.



2. The method of claim 1, wherein said DNA is digested
with the single restriction enzyme BamHI.

3. The method of claim 1, wherein the nucleic acid
probe is the nucleic acid probe termed MLL 0.7B (seq id
no:1).



4. The method of claim 1, wherein the cells are
obtained from a patient suspected of having a leukemia
associated with a chromosomal rearrangement at chromosome
11q23.

5. A method for identifying an individual having a
leukemia associated with an 11q23 chromosome
translocation, comprising digesting a genomic DNA sample
obtained from said individual with the restriction enzyme
BamHI and probing the digested DNA with a 0.7 kb BamHI
restriction fragment obtained from MLL DNA, wherein said
0.7 kb fragment encompasses the breakpoints clustered in
an 8.3 kb BamHI genomic region of the MLL gene.

6. The method of claim 5, wherein the 0.7 kb fragment
is the fragment termed MLL 0.7B (seq id no:1).

7. The method of claim 5, wherein the chromosome 11
translocation in the 8.3 kb region of the MLL gene is a
reciprocal translocation with chromosome 4, chromosome 6,
chromosome 9, chromosome 19 or the X chromosome.



8. A method for detecting leukemic cells containing
11q23 chromosome translocations, comprising:

    (a) obtaining mRNA from cells suspected of
        containing a leukemia-associated
        chromosomal rearrangement at chromosome
        11q23; and

    (b) probing said mRNA with a nucleic acid
        probe capable of identifying normal MLL
        gene transcripts and aberrant MLL gene
        transcripts, wherein a reduction in the
        amount of a normal MLL gene transcript or
        the presence of an aberrant MLL gene
        transcript is indicative of a cell
        containing an 11q23 chromosome
        translocation.

9. The method of claim 8, wherein a reduction in the
amount of a normal MLL gene transcript is characterized
as a reduction in the amount of an MLL gene transcript of
about 12.5 kb, about 12.0 kb or about 11.5 kb in length.

10. The method of claim 8, wherein the nucleic acid
probe is fragment MLL 0.7B (seq id no:1), fragment MLL
0.3BE (seq id no:2), fragment MLL 1.5EB (seq id no:3) or
the cDNA clone 14-7 (seq id no:5).

11. The method of claim 8, wherein the nucleic acid
probe is fluorescently labelled.

12. The method of claim 8, wherein the cells are
obtained from a patient suspected of having a leukemia
associated with a chromosomal rearrangement at chromosome
11q23.

13. A DNA segment, free from total genomic DNA, having a
sequence in accordance with, or complementary to, the
sequence of fragment MLL 0.7B (seq id no:1), fragment MLL
0.3BE (seq id no:2), fragment MLL 1.5EB (seq id no:3),
cDNA clone 14P-18B (seq id no:4) or cDNA clone 14-7 (seq
id no:5), derived from the MLL gene.

14. The DNA segment of claim 13, further defined as the
fragment MLL 0.7B (seq id no:1).

15. The DNA segment of claim 13, further defined as the
fragment MLL 0.3BE (seq id no:2).

16. The DNA segment of claim 13, further defined as the
fragment MLL 1.5EB (seq id no:3).

17. The DNA segment of claim 13, further defined as the
cDNA clone 14-7 (seq id no:5).

18. A kit for use in the detection of leukemic cells
containing 11q23 chromosome translocations, comprising a
first container which includes a nucleic acid probe which
includes a sequence in accordance with the sequences of
nucleic acid probes MLL 0.7B (seq id no:1), MLL 0.3BE
(seq id no:2), MLL 1.5EB (seq id no:3) or 14-7 (seq id
no:5); and a second container which comprises a nucleic
acid probe for use as a control.

19. The kit of claim 18, wherein the first container
includes the nucleic acid probe MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) or 14-7
(seq id no:5).

20. The kit of claim 19, wherein the first container
includes the nucleic acid probes MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) and 14-7
(seq id no:5).

21. The kit of claim 18, further comprising a third
container which includes a restriction enzyme.

22. The kit of claim 21, wherein the first container
includes the nucleic acid probe MLL 0.7B (seq id no:1)
and the third container includes the restriction enzyme
BamHI.

23. The kit of claim 18, wherein the nucleic acid probe
is fluorescently labelled.

24. A protein including an MLL amino acid sequence
purified relative to its natural state.

25. The protein of claim 24, wherein the protein
includes an MLL amino acid sequence telomeric to the
breakpoint region.

26. The protein of claim 25, wherein the protein
includes an MLL amino acid sequence in accordance with
seq id no:8.



27. The protein of claim 24, wherein the protein
includes an MLL amino acid sequence centromeric to the
breakpoint region.

28. The protein of claim 27, wherein the protein
includes an MLL amino acid sequence in accordance with
amino acids 323-623 of seq id no:7.

29. The protein of claim 27, wherein the protein
includes a zinc finger region.

30. An antibody having binding affinity for a protein
including an MLL amino acid sequence.

31. The antibody of claim 30, wherein the protein
includes an MLL amino acid sequence centromeric to the
breakpoint region, an MLL amino acid sequence telomeric
to the breakpoint region or an MLL zinc finger region.



DESCRIPTION

COMPOSITIONS AND METHODS FOR DETECTING
GENE REARRANGEMENTS AND TRANSLOCATIONS

BACKGROUND OF THE INVENTION

This application is a continuation-in-part of
copending application, USSN 07/991,224, filed December
16, 1992, which was a continuation-in-part of USSN
07/900,689, filed June 17, 1992. The entire text of each
of the above-referenced disclosures is specifically
incorporated by reference herein without disclaimer.

The government owns rights in the present invention
pursuant to grants CA42557, CA40046, CA38725, CA34775,
5T32 CA09566 and 5T32 CA09273-12 from the National
Institutes of Health and DE-FG02-86ER60408 from the
Department of Energy.

1. Field of the Invention

The present invention relates generally to the
diagnosis of cancer. The invention concerns the creation
of probes for use in diagnosing and monitoring certain
genetic abnormalities, including those found in leukemia
and lymphoma, using molecular biological hybridization
techniques. In particular, it concerns the localization
of the translocation breakpoint on the MLL gene, the
identification of nucleic acid probes capable of
detecting rearrangements in all patients with the common
11q23 translocations and the identification of MLL mRNA
transcripts characteristic of leukemic cells. MLL fusion
proteins and anti-MLL antibodies are also disclosed.

2. Description of the Related Art

The etiology of a substantial portion of human
diseases lies, at least in part, with genetic factors.
The identification and detection of genetic factors
associated with particular diseases or malformations
provides a means for diagnosis and for planning the most
effective course of treatment. For some conditions,
early detection may allow prevention or amelioration of
the devastating courses of the particular disease.

The genetic material of an organism is located
within one or more microscopically visible entities
termed chromosomes. In higher organisms, such as man,
chromosomes contain the genetic material DNA and also
contain various proteins and RNA. The study of
chromosomes, termed cytogenetics, is often an important
aspect of disease diagnosis. One class of genetic
factors which lead to various disease states are
chromosomal aberrations, i.e., deviations in the expected
number and/or structure of chromosomes for a particular
species or for certain cell types within a species.

There are several classes of structural aberrations
which may involve either the autosomal or sex
chromosomes, or a combination of both. Such aberrations
may be detected by noting changes in chromosome
morphology, as evidenced by band patterns, in one or more
chromosomes. Normal phenotypes may be associated with
rearrangements if the amount of genetic material has not
been altered; however, physical or mental anomalies
result from chromosomal rearrangements where there has
been a gain or loss of genetic material. Deletions, or
deficiencies, refer to loss of part of a chromosome,
whereas duplication refers to addition of material to
chromosomes. Duplication and deficiency of genetic
material can be produced by breakage of chromosomes, by
errors during DNA synthesis, or as a consequence of
segregation of other rearrangements into gametes.

Translocations are interchromosomal rearrangements
effected by breakage and transfer of part of chromosomes
to different locations. In reciprocal translocations,
pieces of chromosomes are exchanged between two or more
chromosomes. Generally, the exchanges of interest are
between non-homologous chromosomes. If all the original
genetic material appears to be preserved, this condition
is referred to as balanced. Unbalanced forms have
duplications or deficiencies of genetic material
associated with the exchange; that is, some material has
been gained or lost in the process.

One of the most interesting associations between
chromosomal aberrations and human disease is that between
chromosomal aberrations and cancer. Non-random
translocations involving chromosome 11 band q23 occur
frequently in both myeloid and lymphoblastic leukemias
(Rowley, 1990b; Heim & Mitelman, 1987). The four most
common reciprocal translocations are t(4;11) and
t(11;19), which exhibit mainly lymphoblastic markers and
sometimes monocytic markers, or both lymphoblastic and
monoblastic markers; and t(6;11) and t(9;11), which are
mainly found in monoblastic and/or myeloblastic leukemias
(Mitelman et al., 1991). Other chromosomes which are
involved in recurring translocations with this band in
acute leukemias are chromosomes X, 1, 2, 10, and 17.

The present inventors have previously demonstrated,
by fluorescence in situ hybridization (FISH), that a
yeast artificial chromosome (YAC) containing the CD3D and
CD3G genes was split in cells with the four most common
translocations (Rowley et al., 1990). Further studies
led the inventors to the identification of the gene
located at the breakpoint, which was named MLL for mixed
lineage leukemia or myeloid/lymphoid leukemia (Ziemin-van
Der Poel et al., 1991). The MLL gene has also been
independently termed ALL-1 (Cimino et al., 1991; Gu et
al., 1992a;b), Htrx (Djabali et al., 1992) and HRX
(Tkachuk et al., 1992). The present inventors
differentiated the more centromeric MLL rearrangements
from the more telomeric breakpoint translocations which
involve the RCK locus (Akao et al., 1991b) or the p54
gene (Lu & Yunis, 1992).

From the same YAC clone as described by the present
inventors (Rowley et al., 1990), a DNA fragment was
obtained which allowed the detection of rearrangements in
leukemic cells from certain patients (Cimino et al.,
1991; 1992). This 0.7 kilobase DdeI fragment allowed
detection of rearrangements in a 5.8 kilobase region in 6
of 7 patients with the t(4;11), 4 of 5 with t(9;11), and
3 of 4 with the t(11;19) translocations (Cimino et al.,
1992). Combining these results with those from a
subsequent series including an additional 14 patients,
the DdeI fragment probe was found to detect
rearrangements in 26 of 30 cases with t(4;11), t(9;11)
and t(11;19) translocations (Cimino et al., 1991; 1992),
which represents an overall detection rate of 87%.
Despite this partial success, the failure of the DdeI
probe to detect all rearrangements is a significant
drawback to its use in clinical diagnosis.
Accordingly, prior to the present invention, there
remained a particular need for the identification of
nucleic acid fragments or probes capable of detecting
leukemic cells from all patients with the common 11q23
translocations. The creation of such probes which may be
used in both Southern blot analyses and in FISH with
either dividing leukemic cells or interphase nuclei would
be particularly important. The elucidation of further
information regarding the MLL gene, such as further
sequence data and information regarding transcription
into mRNA, would also be advantageous, as would the
identification of nucleic acid fragments capable of
differentiating MLL mRNA transcripts from normal and
leukemic cells.



SUMMARY OF THE INVENTION 

The present invention seeks to overcome these and
other drawbacks inherent in the prior art by providing
improved compositions and methods for the diagnosis, and
continued monitoring, of various types of leukemias,
particularly myeloid and lymphoid leukemia, and lymphomas
in humans. This invention particularly provides novel
and improved probes for use in genetic analyses, for
example, in Southern and Northern blotting and in
fluorescence in situ hybridization (FISH) using either
dividing leukemic cells or interphase nuclei.

The inventors first localized the translocation
breakpoint on the MLL gene to within an estimated 9 kb
BamHI genomic region of the MLL gene, and later sequenced
this region and found it to be 8.3 kb in size. They have
further identified short nucleic acid probes, as
exemplified by a breakpoint-spanning 0.7 kb BamHI cDNA
fragment, which detect rearrangements on Southern blot
analysis of singly-digested DNA in all patients with the
common 11q23 translocations, namely t(4;11), t(6;11),
t(9;11), and t(11;19), and also in certain patients with
other rare 11q23 anomalies. The use of this novel
nucleic acid probe represents a significant advantage
over previously described probes which allowed the
molecular diagnosis of leukemia only in certain cases of
common 11q23 translocations, and not in all cases.

The invention also provides probe compositions for
use in Northern blot analyses and methods for identifying
leukemic cells from the pattern of MLL mRNA transcripts
present, which are herein shown to be different in
leukemic cells as opposed to normal cells.

The present invention generally concerns the
breakpoint-spanning gene named MLL, and this term is used
throughout the present text. MLL is the accepted
designation for this gene adopted by the human genome
nomenclature committee (Chromosome Co-ordinating Meeting,
1992); however, other terms are also in current use to
describe the same gene. For example, the terms ALL-1
(Cimino et al., 1991; Gu et al., 1992a;b), Htrx (Djabali
et al., 1992) and HRX (Tkachuk et al., 1992) are also
currently employed as names for the MLL gene. As these
terms in fact refer to the same gene, i.e., to the MLL
gene, each of the foregoing ALL-1, Htrx and HRX 'genes'
is encompassed by the present invention and is
described herein, for simplicity, by the single term
"MLL".

In certain embodiments, the invention concerns a
method for detecting leukemic cells containing 11q23
chromosome translocations that involve MLL, which method
comprises obtaining nucleic acids from cells suspected of
containing a leukemia-associated chromosomal
rearrangement at chromosome 11q23, and probing said
nucleic acids with a probe capable of differentiating
between the nucleic acids from normal cells and the
nucleic acids from leukemic cells. To "differentiate
between the nucleic acids from normal cells and the
nucleic acids from leukemic cells" will generally require
using a probe, such as those disclosed herein, which
allows MLL DNA or RNA from normal cells to be identified
and differentiated from MLL DNA or RNA from leukemic
cells by criteria such as, e.g., number, pattern, size or
location of the MLL nucleic acids.

The cells suspected of containing a chromosomal
rearrangement at chromosome 11q23 may be cells from cell
lines or otherwise transformed or cultured cells.
Alternatively, they may be cells obtained from an
individual suspected of having a leukemia associated with
an 11q23 chromosome translocation, or cells from a
patient known to be presently or previously suffering
from such a disorder.



The nucleic acids obtained for analysis may be DNA,
and preferably, genomic DNA, which may be digested with
one or more restriction enzymes and probed with a nucleic
acid probe capable of detecting DNA rearrangements from
leukemic cells containing 11q23 chromosome
translocations. Techniques such as these are based upon
'Southern blotting' and are well known in the art (for
example, see Sambrook et al. (1989), incorporated herein
by reference). A large battery of restriction enzymes
is commercially available and the conditions for
Southern blotting are described hereinbelow, suitable
modifications of which will be known to those skilled in
the art of molecular biology.

Preferred nucleic acid probes for use in Southern
blotting to detect leukemic cells containing 11q23
chromosome translocations are those probes which include
a sequence in accordance with the sequence of a 0.7 kb
BamHI fragment of the cDNA clone 14P-18B derived from the
MLL gene, and more preferably, will be the probe MLL 0.7B
(seq id no:1) itself. The use of this probe is
particularly advantageous as this fragment encompasses
the breakpoints clustered in the 8.3 kb BamHI genomic
region (seq id no:6) of the MLL gene and allows the
detection of all the common 11q23 translocations.
Moreover, using MLL 0.7B (also simply referred to as
0.7B) presents the added advantage that DNA may be
digested with only a single restriction enzyme, namely
BamHI. Probe MLL 0.7B (seq id no:1) is derived from a
cDNA clone that lacks Exon 8 sequences, but this clearly
has no adverse effects on breakpoint detection using this
probe.

Patients' or cultured cells may also be analyzed for
the presence of 11q23 chromosome translocations by
obtaining RNA, and preferably, mRNA, from the cells and
probing the RNA with a nucleic acid probe capable of
differentiating between the MLL mRNA species in normal
and leukemic cells. This differentiation will generally
involve using a probe capable of identifying normal MLL
gene transcripts and aberrant MLL gene transcripts,
wherein a reduction in the amount of a normal MLL gene
transcript, such as those estimated to be about 12.5 kb,
12.0 kb or 11.5 kb in length, or the presence of an
aberrant MLL gene transcript, not detectable in normal
cells, will be indicative of a cell containing an 11q23
chromosome translocation. Techniques of detecting and
characterizing mRNA transcripts, based upon Northern
blotting, are described herein and suitable modifications
will be known to those of skill in the art (e.g., see
Sambrook et al., 1989).

It is important to note that throughout this text
the sizes quoted for certain transcripts are estimated
measurements from Northern blot analyses. It is well
known in the art that agarose gel resolution of RNA
species of about 9 to 10 kb in size, or greater, leads to
an approximate size determination, especially with sizes
of greater than about 10 kb. Hence, size determinations
made initially by this technique may later be found to be
over- or under-estimates of the true size of a given
transcript. For example, the MLL translocation
breakpoint was first localized to an estimated 9 kb BamHI
genomic region which the inventors later found, by
sequencing, to be 8.3 kb in size. It is possible that
the estimated sizes of the larger mRNA transcripts may
differ as much as about 2 kb up to about 3 kb from their
size determined by sequencing, and that the 12.5 kb to
11 kb size range may be more accurately represented by a
15 kb to 13 kb size range. This general phenomenon has
been observed before in regard to the MLL gene itself
(e.g., Cimino et al., 1991; 1992).
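
As a simple worked illustration of the discrepancies
discussed above, the differences between the earlier and the
later size figures can be computed directly. All sizes are
those quoted in the text; the variable names are hypothetical
and the pairing of the upper and lower bounds of the two size
ranges is an assumption made for this sketch.

```python
# (earlier estimate, later figure) in kb, as quoted in the text.
size_pairs = {
    "BamHI genomic fragment": (9.0, 8.3),
    "largest normal transcript": (12.5, 15.0),
    "smallest normal transcript": (11.0, 13.0),
}

# Positive values mean the earlier figure was an under-estimate.
shifts = {name: round(later - earlier, 1)
          for name, (earlier, later) in size_pairs.items()}

for name, shift in shifts.items():
    print(name, shift, "kb")
```

On these figures, the transcript sizes shift by amounts within
the range of about 2 kb up to about 3 kb mentioned above,
whereas the genomic fragment was initially over-estimated by
less than 1 kb.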

Using the probes of this invention, a reduction in
the amount of MLL gene transcripts estimated to be of
about 12.5 kb, 12.0 kb or 11.5 kb in length (or about
15-13 kb), as compared to the level of such transcripts
in normal cells, is indicative of cells which contain an
11q23 chromosome translocation. The size of aberrant MLL
transcripts will naturally vary between the individual
cell lines and patients' cells examined, but will
nevertheless always be distinguishable from the size and
pattern of MLL transcripts identified by the same
probe(s) in normal cells.

In RS4;11 cells, the specific rearranged mRNA
transcripts identified as characteristic of leukemic
cells are estimated to be of about 11.5 kb, 11.25 kb or
11.0 kb in length, and so an elevation in the levels of
such transcripts is indicative of a cell containing an
11q23 chromosome translocation. In the Karpas 45 cell
line (K45 t(X;11)(q13;q23)), the aberrant mRNA
transcripts have estimated sizes of about 8 kb and about
6 kb, which are therefore another example of transcripts
characteristic of leukemic cells. In any event, it will
be clear that using the probes of the present invention
one may differentiate between normal and leukemic cell
transcripts, and thus identify leukemic cells in an assay
or screening protocol, regardless of the actual size and
pattern of the aberrant transcripts themselves.

Probes preferred for use in analyzing mRNA
transcripts in order to identify cells with an 11q23
chromosome translocation, i.e., for use in Northern
blotting detection, are contemplated to be those based
upon the cDNA clones 14P-18B (seq id no:4) and 14-7 (seq
id no:5). In such Northern blotting detection, the use
of cDNA clone 14-7 itself (seq id no:5) and various
fragments of clone 14P-18B (seq id no:4) is contemplated.
The use of 14P-18B fragments in Northern blotting is
generally preferred, with the nucleic acid fragments
termed MLL 0.7B (0.7B, seq id no:1), MLL 0.3BE (0.3BE,
seq id no:2) and MLL 1.5EB (1.5EB, seq id no:3) being
particularly preferred.

The use of a combination of the probes described
above may provide further advantages in certain cases as
it may allow the differentiation of further distinct MLL
gene transcripts. An example of this is presented herein
in the case of the RS4;11 cell line. Here, it is
demonstrated that normal cells contain an MLL gene
transcript of estimated length 11.5 kb and that RS4;11
leukemic cells have a reduced amount of this normal
transcript (in common with their reduced amount of the
12.5 kb and 12.0 kb normal transcripts). However, the
inventors have also determined that the RS4;11 leukemic
cells contain an aberrant mRNA transcript, also estimated
to be about 11.5 kb in length, which is present in
significant quantities and may even be termed over-expressed
(a specific increase in the level of an mRNA
transcript in comparison to the level in normal cells is
indicative of "over-expression").

The probe termed 1.5EB (seq id no:3) is herein shown
to detect the normal 11.5 kb transcript, and a weak
signal in a Northern blot employing this probe is
therefore indicative of a leukemic cell containing an
11q23 chromosome translocation. Each of the more
telomeric probes, namely 0.7B, 0.3BE and 14-7 (seq id
nos:1, 2, and 5, respectively), is shown to detect the
over-expressed, aberrant, 11.5 kb transcript in RS4;11
cells, and a strong signal in a Northern blot employing
any of these probes therefore characterizes a leukemic
cell with an RS4;11-like translocation. A further
advantage of the present invention is, therefore, that in
using more than one probe, it provides methods by which
to differentiate between normal and aberrant transcripts
which may be similar in size, and thus increases the
number of factors with which to differentiate between
leukemic and normal cells.
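
The multi-probe reasoning above can be sketched as a small decision function. This is only an illustration: the probe names are taken from the text, but the "weak", "normal" and "strong" labels, the function name and the returned strings are hypothetical conventions, and a real interpretation would rest on the blots themselves.

```python
# Probe names come from the text; the signal labels and the returned
# strings are hypothetical conventions for this sketch.
MORE_TELOMERIC_PROBES = ("0.7B", "0.3BE", "14-7")


def interpret_11_5_kb_band(signal_by_probe):
    """Interpret signal strengths for the band of about 11.5 kb.

    signal_by_probe maps a probe name to "weak", "normal" or "strong",
    judged relative to the same band in normal control cells.
    """
    # A strong signal with any of the more telomeric probes suggests the
    # over-expressed aberrant transcript described for RS4;11 cells.
    aberrant_overexpressed = any(
        signal_by_probe.get(probe) == "strong" for probe in MORE_TELOMERIC_PROBES
    )
    # A weak signal with 1.5EB suggests a reduced normal transcript.
    normal_reduced = signal_by_probe.get("1.5EB") == "weak"

    if aberrant_overexpressed:
        return "consistent with an RS4;11-like translocation"
    if normal_reduced:
        return "consistent with an 11q23 translocation"
    return "no difference from normal detected at this band"


print(interpret_11_5_kb_band({"1.5EB": "weak", "0.7B": "strong"}))
```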

The probes of the present invention may also be used
to identify leukemic cells containing 11q23 chromosome
translocations in situ, that is, without extraction of
the genetic material. Fluorescent in situ hybridization
(FISH), which allows cell nuclei to be analyzed directly,
is one method which is considered to be particularly
suitable for use in accordance with the present
invention. Cells may be analyzed in metaphase, a stage
in cell division wherein the chromosomes are individually
distinguishable due to contraction. However, the methods
and compositions of the present invention are
particularly advantageous in that they are equally
suitable for use with interphase cells, a stage wherein
chromosomes are so elongated that they are entwined and
cannot be individually distinguished.

Cloned DNA probes from both sides of the
translocation breakpoint region can be used with FISH to
detect the translocation in leukemic cells. In normal
cells, these two probes would be together and they would
appear as a single signal. In cells with a
translocation, the centromeric probe would remain on the
derivative 11 chromosome whereas the telomeric probe
would be translocated to the other derivative chromosome.
This would result in two smaller signals, one on each
translocation partner. As the inventors have shown that
about 30% of patients have a deletion of the MLL gene
immediately telomeric to the breakpoint, they have cloned
a series of telomeric probes that can be used reliably to
detect the translocation in virtually all patients.
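
The two-probe FISH logic described above may be illustrated with a minimal sketch that classifies one centromeric/telomeric probe pair by the distance between the two signals. The function name, the coordinate representation and the distance threshold are assumptions made for illustration; the text does not specify any image-analysis procedure.

```python
import math


def classify_probe_pair(centromeric_xy, telomeric_xy, max_fused_distance=1.0):
    """Classify one centromeric/telomeric signal pair in a nucleus.

    Coordinates are (x, y) positions in arbitrary units. The threshold
    max_fused_distance is a hypothetical value chosen for this sketch.
    """
    if telomeric_xy is None:
        # Possible when the region telomeric to the breakpoint is deleted.
        return "telomeric signal absent"
    distance = math.dist(centromeric_xy, telomeric_xy)
    # Signals lying together appear as one spot, as expected for a normal
    # chromosome 11; widely separated signals suggest a translocation.
    return "fused" if distance <= max_fused_distance else "split"


print(classify_probe_pair((0.0, 0.0), (0.3, 0.4)))
print(classify_probe_pair((0.0, 0.0), (3.0, 4.0)))
```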

Whether employing Southern blotting, Northern
blotting, FISH, or any other amenable techniques, the
present invention provides improved methods for analyzing
cells from patients suspected of having a leukemia
associated with an 11q23 chromosome translocation. In
that the probes disclosed herein are able to detect DNA
rearrangements in all patients with the common 11q23
translocations, i.e., there are no false-negatives, their
use represents a significant advance in the art.

This invention will be particularly useful in the
analysis of individuals who have already had one
malignant disease that has been treated with certain
drugs that induce leukemia with 11q23 translocations in
10 to 25% of patients (Ratain & Rowley, 1992). Thus
cells from these patients can be monitored with Southern
blot analysis, PCR and FISH to detect cells with an 11q23
translocation and thus identify patients very early in
the course of their disease. In addition, the probes
described in this invention can be used to monitor the
response to therapy of leukemia patients known to have an
11q23 translocation. These leukemic cells show a
substantial decrease in frequency in response to therapy.

In further embodiments, the present invention
concerns compositions comprising nucleic acid segments,
and particularly DNA segments, isolated free from total
genomic DNA, which have a sequence in accordance with, or
complementary to, the sequence of cDNA clone 14P-18B (seq
id no:4) or cDNA clone 14-7 (seq id no:5) derived from
the MLL gene. Such DNA segments are exemplified by the
clones 14P-18B (seq id no:4) and 14-7 (seq id no:5)
themselves, and also by various fragments of such
sequences. cDNA clones 14P-18B and 14-7 may be
characterized as being derived from the MLL gene, as
being about 4.1 kb and about 1.3 kb in length,
respectively, and as having restriction patterns as
indicated in Figure 1 and Figure 2.

The invention provides probes which span the MLL
breakpoint, e.g., 0.7B; probes centromeric to the
breakpoint, e.g., 1.5EB; and probes telomeric to the
breakpoint, e.g., 0.3BE, 14-7, and even 0.8E.
Particularly preferred DNA segments of the present
invention are those DNA segments represented by the
nucleic acid fragments, or probes, termed MLL 0.7B (0.7B,
seq id no:1), MLL 0.3BE (0.3BE, seq id no:2) and MLL
1.5EB (1.5EB, seq id no:3).

The nucleic acid segments and probes of the present
invention are contemplated for use in detecting cells,
and particularly, cells from human subjects, which
contain an 11q23 chromosome translocation. However, they
are not limited to such uses and also have utility in a
variety of other embodiments, for example, as probes or
primers in nucleic acid hybridization embodiments. The
ability of these nucleic acid segments to specifically
hybridize to MLL gene-like sequences will enable them to
be of use in various assays to detect complementary
sequences, other than for diagnostic purposes. The use
of such nucleic acid segments as primers for the cloning
of further portions of genomic DNA, or for the
preparation of mutant species primers, is particularly
contemplated. The DNA segments of the invention may also
be employed in recombinant expression. For example, as
disclosed herein, they have been used in the production of
peptides or proteins for further analysis or for antibody
generation.

The present invention also embodies kits for use in
the detection of leukemic cells containing 11q23
chromosome translocations. Kits for use in both Southern
and Northern blotting and in FISH protocols are
contemplated, and such kits will generally comprise a
first container which includes one or more nucleic acid
probes which include a sequence in accordance with the
sequences of nucleic acid probes MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) or 14-7
(seq id no:5), and a second container which comprises one
or more unrelated nucleic acid probes for use as a
control. In preferred embodiments, such kits will
include one or more of the nucleic acid probes termed MLL
0.7B (seq id no:1), MLL 0.3BE (seq id no:2), MLL 1.5EB
(seq id no:3) or 14-7 (seq id no:5) themselves, and kits
for use in connection with FISH or Northern blotting
will, most preferably, include all such nucleic acid
probes or segments.

Kits for the detection of leukemic cells containing
11q23 chromosome translocations by Southern blotting may
also include a third container which includes one or more
restriction enzymes. Particularly preferred Southern
blotting kits will be those which include the nucleic
acid probe MLL 0.7B (seq id no:1) and the restriction
enzyme BamHI. Naturally, kits for use in connection with
FISH will contain one or more nucleic acid probes which
are fluorescently labelled.

Further embodiments of the present invention concern
MLL peptides, polypeptides, proteins, and fusions thereof
and antibodies having binding affinity for such proteins,
peptides and fusions. The invention therefore concerns
proteins or peptides which include an MLL amino acid
sequence, purified relative to their natural state. Such
proteins or peptides may contain only MLL sequences
themselves or may contain MLL sequences linked to other
protein sequences, such as, e.g., 'natural' sequences
derived from other chromosomes or portions of
'engineered' proteins such as glutathione-S-transferase
(GST), ubiquitin, β-galactosidase and the like.

Proteins prepared in accordance with the invention
may include MLL amino acid sequences which are either
telomeric or centromeric to the breakpoint region, as
exemplified by the amino acid sequences of seq id no:8
and amino acids 323-623 of seq id no:7, respectively.
Other proteins which are contemplated to be particularly
useful are those including a zinc finger region from seq
id no:7, such as those generally located between amino
acids 574-1184, and more particularly, those including
amino acids 574 to about 810 and about 1057 to 1184 of
seq id no:7. Antibodies prepared in accordance with the
invention may be directed against any of the
'centromeric' or 'telomeric' proteins described herein,
or portions thereof, with antibodies against the zinc
finger regions of seq id no:7 being particularly
contemplated.



BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1.
Alignment of cDNA clones of the MLL gene with genomic
sequences. The top thick solid line represents the
genomic sequence in which not all the restriction sites
are indicated. The sizes above the line, 14 kb, 8.3 kb
and ~20 kb, refer to the BamHI fragments. The two dashed
lines located above the 14 kb BamHI genomic fragment
indicate the 2.1 kb BamHI/SstI telomeric fragment (14BS),
and the 0.8 kb PstI centromeric fragment (14P) used to
screen the cDNA library. The solid line under each cDNA
clone indicates the region of homology between clones.
The predicted direction of transcription of MLL and the
open reading frame of clone 14-7 is indicated by the
arrow. Restriction enzymes used: B, BamHI; S, SstI; Sa,
SalI; P, PstI; H, HindIII; X, XhoI; E, EcoRI; Bg, BglI.

Figure 2.
A map of cDNA clones 14-7 and 14P-18B. Restriction
enzymes are the same as in Figure 1. The solid lines
below the cDNA clones indicate the cDNA fragments used in
the Southern and Northern hybridizations. All of clone
14-7, and three adjacent fragments of 0.3 kb BamHI/EcoRI
(MLL 0.3BE), 0.7 kb BamHI (MLL 0.7B) and 1.5 kb
EcoRI/BamHI (MLL 1.5EB) from cDNA clone 14P-18B were
used. Note that the EcoRI site used to excise the 1.5 kb
fragment was a cloning EcoRI site. The breakpoint region
within the 0.7 kb BamHI fragment is also shown, as is the
0.8 kb EcoRI probe (MLL 0.8E) employed in analyzing the
Karpas 45 cell line. It will be noted that the
orientation of the probes represented in this figure is
reversed to that in sequence 14P-18B (seq id no:4), where
MLL 1.5EB is first, MLL 0.7B is next and MLL 0.3BE is
last.

Figure 3.
Southern blot of DNA from cell lines and patient leukemic
cells with 11q23 translocations digested with BamHI and
hybridized to MLL 0.7B. Lanes 1, 7, control DNA; lane
2, RS4;11 cell line; lanes 3-5, patients 1-3 (as detailed
in Table 1); lane 6, Sup-T13 cell line showing weak
hybridization to two rearranged bands of 7.0 kb and
1.4 kb; lane 8, RC-K8 cell line. DNA fragment sizes in
kilobases are shown on the left.

Figure 4.
Northern blot analyses of poly(A)+ RNA. Poly(A)+ RNA was
isolated from cell lines in logarithmic growth phase
except where noted. RNA sizes are indicated on the left.
Figure 4 consists of Figure 4A and Figure 4B.
Figure 4A. Each lane 1 is the RCH-ADD cell line; each
lane 2 is the RC-K8 cell line and each lane 3 is the
RS4;11 cell line in stationary growth phase. The
Northern blots in this panel were hybridized sequentially
to the 14-7 probe, (a); the MLL 0.7B probe, (b); and the
MLL 1.5EB probe, (c). Hybridization to actin is also
shown in this panel in (a).
Figure 4B. RNA from the RS4;11 cell line. The Northern
blots in this panel were hybridized in the same manner to
the 14-7 probe, (a); the MLL 0.3BE probe, (b); the MLL
0.7B probe, (c); and the MLL 1.5EB probe, (d).

Figure 5.
Schematic representation of the Northern blot results
obtained from the sequential hybridization of probes (14-7,
MLL 0.3BE, MLL 0.7B and MLL 1.5EB) to control (C) and
RS4;11 cell line (4;11) RNA. Only the large size
transcripts are shown. The solid lines indicate normal
sized transcripts of normal mRNA with estimated sizes of
12.5, 12.0 and 11.5 kb which are detected in both control
and RS4;11 cell lines. The dashed lines represent the
aberrant sized transcripts with estimated sizes of 11.5,
11.25 and 11.0 kb detected in the RS4;11 cell line. In
the RS4;11 cell line the normal and altered (estimated)
11.5 kb mRNA transcripts are indicated by an overlapping
broken and solid line. The line thickness indicates the
strength of the hybridization signal. The chromosomal
origin of each transcript is depicted on the right.

Figure 6.
Southern hybridization of patient DNA digested with BamHI
and probed with the 0.7 kilobase BamHI cDNA fragment.
Sizes are in kilobases. Lane 1: Normal peripheral white
blood cell DNA, Lane 2: AML with t(1;11)(q21;q23), Lane
3: ALL with t(4;11)(q21;q23), Lane 4: ALL with
t(4;11)(q21;q23), Lane 5: ALL with t(4;11)(q21;q23), Lane
6: ALL with t(4;11)(q21;q23), Lane 7: ALL with
t(4;11)(q21;q23), Lane 8: AML with t(6;11)(q27;q23),
Lane 9: AML with t(6;11)(q27;q23), Lane 10: AML with
t(9;11)(p22;q23), Lane 11: AML with t(10;11)(p13;q21),
Lane 12: Lymphoma with t(10;11)(p15;q22), Lane 13: AML
with ins(10;11)(p11;q23q24), Lane 14: AML with
ins(10;11)(p13;q21q24), Lane 15: ALL with
t(11;19)(q23;p13.3), Lane 16: AML with
t(11;19)(q23;p13.3), Lane 17: AML with
t(11;22)(q23;q12). A single germline band was detected
in normal DNA in lane 1 and in patient samples with
non-11q23 breakpoints in lanes 11, 12, and 14.
Rearrangements were detected in all other lanes. Lanes
2, 3, 4, 6, 7, 8, 10, 13, 16, 17 had two rearranged
bands, and lanes 5, 9, and 15 had one rearranged band.
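
The lane summary for Figure 6 can be checked with a short tally. This sketch simply encodes the lane numbers stated in the legend and confirms that the three groups account for all 17 lanes without overlap; the function and variable names are illustrative.

```python
def summarize_figure_6_lanes():
    """Tally the Figure 6 lanes by band pattern, as listed in the legend."""
    germline_only = {1, 11, 12, 14}
    two_rearranged_bands = {2, 3, 4, 6, 7, 8, 10, 13, 16, 17}
    one_rearranged_band = {5, 9, 15}

    groups = (germline_only, two_rearranged_bands, one_rearranged_band)
    all_lanes = set(range(1, 18))  # lanes 1 through 17

    # The groups should cover every lane exactly once.
    covers_all = set().union(*groups) == all_lanes
    no_overlap = sum(len(group) for group in groups) == len(all_lanes)

    return {
        "germline_only": len(germline_only),
        "rearranged": len(two_rearranged_bands) + len(one_rearranged_band),
        "consistent": covers_all and no_overlap,
    }


print(summarize_figure_6_lanes())
```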

Figure 7.
Southern hybridization of leukemic and normal DNA
digested with BamHI and probed with the 0.7 kilobase
BamHI cDNA fragment and with the centromeric and
telomeric PCR-derived probes. Sizes are in kilobases.
Figure 7 consists of Figure 7A, Figure 7B and Figure 7C.
Figure 7A. DNA probed with 0.7 kilobase cDNA probe.
Lane 1: Biphenotypic leukemia with t(11;19)(q23;p13.3),
lane 2: ALL with t(11;19)(q23;p13.3), lane 3: AML with
t(11;19)(q23;p13.3), lane 4: normal DNA, lane 5: AML
with t(6;11)(q27;q23), lane 6: Follicular lymphoma with
t(6;11)(p12;q23). A single germline 8.3 kilobase band is
identified in normal DNA in lane 4 and is also present in
all other lanes. Two rearranged bands, corresponding to
the two derivative chromosomes, are identified in lanes
1, 2, and 3. A single rearranged band is present in
lanes 5 and 6.
Figure 7B. The blot from panel A was stripped and
rehybridized with the centromeric PCR probe. The
germline 8.3 kilobase band is again present in all lanes.
In lanes 1-3, one of the two rearranged bands is
detected. In lane 3, the rearranged band is slightly
larger than the germline band. In lanes 5 and 6, the
single rearranged band is also identified.
Figure 7C. The blot from panel A was stripped and then
rehybridized with the telomeric PCR probe. The germline
band is present in all lanes. In lanes 1-3, one of the
two rearranged bands is identified. In lane 2, the
rearranged band is slightly smaller than the germline
band. However, the single rearranged band in lanes 5 and
6 is not detected.

Figure 8.
Southern hybridization of patient DNA digested with BamHI
and probed with 0.7 kilobase BamHI cDNA fragment and with
the centromeric and telomeric PCR-derived probes. Lane
1: AML with t(1;11)(q21;q23) - same patient as in lane 2
of Figure 6. Lane 2: ALL with t(4;11)(q21;q23) - the
same patient as shown in lane 6 of Figure 6. Figure 8
consists of Figure 8A, Figure 8B and Figure 8C.
Figure 8A. DNA probed with the 0.7 kilobase cDNA probe.
The germline band and two rearranged bands are present in
both lanes.
Figure 8B. The blot from panel A was stripped and
rehybridized with the centromeric PCR probe. The
germline band and both rearranged bands are again
detected.
Figure 8C. The blot from panel A was stripped and then
rehybridized with the telomeric PCR probe. The germline
band and only one of the rearranged bands are detected.

Figure 9. Representation of the 8.3 kb BamHI Genomic
Section of the MLL gene and Various cDNA Probes.

Figure 10. Reactivity of Specific anti-MLL Antisera
Directed Against the MLL Amino Acids of Seq Id No:8.
Western blots of pre-immune sera (lanes 1, 7 & 8) and
high titer rabbit antisera (lanes 2-6, 9 & 19) specific
for the MLL portion of the MLL-GST fusion protein. The
creation of an expression vector for the production of an
MLL amino acid-containing fusion protein containing MLL
amino acids of seq id no:8 and GST is described in
Example IV.

Figure 11. Southern blot analysis of DNA from human
placenta (C) and the Karpas 45 cell line (K45,
t(X;11)(q13;q23)) digested with BamHI and hybridized to
the 0.7B cDNA fragment of MLL (seq id no:1). DNA size
markers are shown on the left and the lines on the right
denote the rearranged DNA bands detected in the Karpas 45
cell line.

Figure 12. Northern blot analysis of RNA isolated from
two control cell lines RC-K8 (C) and RCH-ADD (C) and the
Karpas 45 cell line (K45) with a t(X;11)(q13;q23)
translocation. The blot was sequentially hybridized to
the 0.8E, 0.7B and 1.5EB cDNA fragments of the MLL gene.
Hybridization to actin is also shown. The markers on the
right denote the size of the detected transcripts, and
the lines to the right of the blots locate the altered
MLL transcripts seen in the Karpas 45 cell line.

DETAILED DESCRIPTION
OF THE PREFERRED EMBODIMENTS

Introduction 

The molecular analysis of recurring structural
chromosome abnormalities in human neoplasia has led to
the identification of a number of genes involved in these
rearrangements. These genetic alterations are implicated
in the development of malignancies. For example, in
chronic myelogenous leukemia, the proto-oncogene ABL is
translocated from chromosome 9 to the BCR gene on
chromosome 22 leading to the generation of a chimeric
gene and a fusion protein (Rowley, 1990b). In lymphoid
malignancies, translocations frequently involve the
immunoglobulin or T-cell receptor genes which are
juxtaposed to key oncogenes causing their abnormal
expression (Rowley, 1990a).

Translocations involving chromosome band 11q23 have
been identified as a frequent cytogenetic abnormality in
lymphoid and myeloid leukemias and in lymphomas
(Sandberg, 1990). In addition to leukemias that occur de
novo, 11q23 translocations are also observed in
therapy-related leukemias. The t(4;11) has been reported in 2%
to 7% of all cases of acute lymphoblastic leukemia (ALL)
and in up to 60% of leukemias in children under the age
of one year (Parkin et al., 1982; Pui et al., 1991;
Kaneko et al., 1988). By French-American-British (FAB)
Cooperative Group criteria, these leukemias are usually
classified morphologically as L1. Typically, these
patients express myeloid or monocytoid markers in
addition to the B-cell lymphoid markers (Kaneko et al.,
1988; Drexler et al., 1991). On flow cytometry, a
characteristic phenotype, CD10-, CD15+, CD19+, CD24-/+,
has been reported (Pui et al., 1991). These patients
often present with hyperleukocytosis and early central
nervous system involvement (Arthur et al., 1982).

The t(11;19) is more complex because two
translocations involving different breakpoints in 19p
with different phenotypic features have been identified.
Approximately two-thirds have a t(11;19)(q23;p13.3) and
include patients with ALL, biphenotypic leukemia, and
infants or young children with AML. One-third have a
t(11;19)(q23;p13.1) and are generally older children or
adults with AML-M4 and M5. The t(4;11) and the t(11;19)
have been recognized as a cytogenetic subset in ALL with
a poor prognosis (Gibbons et al., 1990).

Translocations involving 11q23 are frequent in acute
myeloid leukemia (AML) and have also been found to occur
preferentially in childhood (Fourth Int. Wksh. Cancer
Genet. Cytogenet., 1984). The t(9;11) and both t(11;19)
are the most common, but other rearrangements, such as
the t(6;11), an insertion (10;11), and deletions
involving 11q23 have also been reported (Mitelman et al.,
1991). Morphologically these cases are usually
categorized as acute myelomonocytic leukemia (AML-M4) or
acute monoblastic leukemia (AML-M5) by FAB criteria.
Similar to ALL, these patients often present with high
leukemic blast cell counts. 11q23 abnormalities have
generally been considered to carry a poor prognosis in
AML (Fourth Int. Wksh. Cancer Genet. Cytogenet., 1984).
However, the use of intensive chemotherapy in these
patients has led to complete remission rates and
remission durations that are similar to a group with
favorable cytogenetic abnormalities (Samuels et al.,
1988). Many cases of AML with 11q23 anomalies have been
found, by flow cytometry, to express lymphoid markers
(Cuneo et al., 1992).

Abnormalities of 11q23 have been found to be common
in both the lymphoid and myeloid leukemias as well as in
biphenotypic leukemias which have both lymphoid and
myeloid features (Hudson et al., 1991). This has led to
the hypothesis that rearrangements of a gene at 11q23 may
affect a pluripotential progenitor cell capable of either
myeloid or lymphoid differentiation. Alternatively, a
mechanism for differentiation that is shared by both
lymphoid and myelo-monocytic stem cells may be
deregulated as a consequence of these translocations.

DNA Segments and Nucleic Acid Hybridization

As used herein, the term "DNA segment" is intended
to refer to a DNA molecule which has been isolated free
of total genomic DNA of a particular species. Therefore,
DNA segments of the present invention will generally be
MLL DNA segments which are isolated away from total human
genomic DNA, although DNA segments isolated from other
species, such as, e.g., Drosophila, may also be included
in certain embodiments. Included within the term "DNA
segment" are DNA segments which may be employed as
probes, and those for use in the preparation of vectors,
as well as the vectors themselves, including, for
example, plasmids, cosmids, phage, viruses, and the like.

The techniques described in the following detailed
examples are the generally preferred techniques for use
in connection with certain preferred embodiments of the
present invention. However, in that this invention
concerns nucleic acid sequences and DNA segments, it will
be apparent to those of skill in the art that this
discovery may be used in a wide variety of molecular
biological embodiments.

The DNA sequences disclosed herein will also find
utility as probes or primers in modifications of the
nucleic acid hybridization embodiments detailed in the
following examples. As such, it is contemplated that
oligonucleotide fragments corresponding to any of the
cDNA or genomic sequences disclosed herein for stretches
of between about 10 nucleotides to about 20 or to about
30 nucleotides will have utility, with even longer
sequences, e.g., 40, 50 or 100 bases, 1 kb, 2 kb or 4 kb,
8.3 kb, 20 kb, 30 kb, 50 kb or even up to about 100 kb or
more also having utility. The larger sized DNA segments
in the order of about 20, 30, 50 or about 100 kb or even
more, are contemplated to be useful in FISH embodiments.

The ability of such nucleic acid probes to
specifically hybridize to MLL-encoding or other MLL
genomic sequences will enable them to be of use in a
variety of embodiments. For example, the probes can be
used in a variety of assays for detecting the presence of
complementary sequences in a given sample. However,
other uses are envisioned, including the use of the
sequence information for mapping the precise breakpoints
in individual patients, and for the preparation of mutant
species primers or primers for use in preparing other
genetic constructions.

Nucleic acid molecules having stretches of 10, 20, 
30, 50, 100, 200, 500 or 1000 or so nucleotides or even 
more, in accordance with or complementary to any of seq 
id no:1 through seq id no:6 will have utility as
hybridization probes. These probes will be useful in a 
variety of hybridization embodiments, not only in 
Southern and Northern blotting in connection with 
analyzing patients' genes, but also in analyzing normal 
hematopoietic development and in charting the evolution 
of certain genes. The total size of fragment used, as 
well as the size of the complementary stretch(es), will
ultimately depend on the intended use or application of 
the particular nucleic acid segment. Smaller fragments 
will generally find use in hybridization embodiments, 
wherein the length of the complementary region may be 
varied, such as between about 10 and about 100 
nucleotides, up to 0.7 kb, 1.3 kb or 1.5 kb or even up to 
8.3 kb or more, according to the complementary sequences 
one wishes to detect. 
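
To illustrate what is meant by stretches "in accordance with or complementary to" a sequence, the following sketch enumerates every contiguous stretch of a chosen length together with its reverse complement. The function names and the short example sequence are placeholders invented for illustration; the sequence is not taken from any of seq id no:1 through seq id no:6.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def probe_stretches(sequence, length):
    """Return (stretch, reverse_complement) pairs for every contiguous
    stretch of the given length; an empty list if the sequence is shorter."""
    return [
        (sequence[start:start + length],
         reverse_complement(sequence[start:start + length]))
        for start in range(len(sequence) - length + 1)
    ]


# Placeholder sequence, not part of the disclosed sequence listings.
example = "GATTACAGATTACA"
for stretch, complementary in probe_stretches(example, 10):
    print(stretch, complementary)
```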

The use of a hybridization probe of about 10
nucleotides in length allows the formation of a duplex
molecule that is both stable and selective. Molecules
having complementary sequences over stretches greater
than 10 bases in length are generally preferred, though,
in order to increase stability and selectivity of the
hybrid, and thereby improve the quality and degree of
specific hybrid molecules obtained. One will generally
prefer to design nucleic acid molecules having
gene-complementary stretches of 15 to 20 nucleotides, or even
longer where desired. Such fragments may be readily
prepared by, for example, directly synthesizing the
fragment by chemical means, by application of nucleic
acid reproduction technology, such as the PCR technology
of U.S. Patent 4,603,102 (herein incorporated by
reference) or by introducing selected sequences into
recombinant vectors for recombinant production.

Accordingly, the nucleotide sequences of the
invention may be used for their ability to selectively
form duplex molecules with complementary stretches of
MLL-like genes or cDNAs. Depending on the application
envisioned, one will desire to employ varying conditions
of hybridization to achieve varying degrees of
selectivity of probe towards target sequence. For
applications requiring high selectivity, one will
typically desire to employ relatively stringent
conditions to form the hybrids, e.g., one will select
relatively low salt and/or high temperature conditions,
such as provided by 0.02M-0.15M NaCl at temperatures of
50°C to 70°C. Such selective conditions tolerate little,
if any, mismatch between the probe and the template or
target strand, and would be particularly suitable for
isolating MLL-like genes, for example, to gather
information on the gene in different cell types or at
different stages of the cell's cycle.

Of course, for some applications, for example, where
one desires to prepare mutants employing a mutant primer
strand hybridized to an underlying template or where one
seeks to isolate MLL-encoding sequences from related
species, functional equivalents, or the like, less
stringent hybridization conditions will typically be
needed in order to allow formation of the heteroduplex.
In these circumstances, one may desire to employ
conditions such as 0.15M-0.9M salt, at temperatures
ranging from 20°C to 55°C. Cross-hybridizing species can
thereby be readily identified as positively hybridizing
signals with respect to control hybridizations. In any
case, it is generally appreciated that conditions can be
rendered more stringent by the addition of increasing
amounts of formamide, which serves to destabilize the
hybrid duplex in the same manner as increased
temperature. Thus, hybridization conditions can be
readily manipulated, and thus will generally be a method
of choice depending on the desired results. Less
stringent conditions would be suitable for identifying
related genes, such as, for example, further Drosophila
or yeast genes, or genes from any organism known to be
interesting from an evolutionary or developmental
standpoint.
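
The two sets of hybridization conditions described above can be summarized in a small lookup, sketched here for illustration. The salt and temperature ranges are those stated in the text; the dictionary layout, the regime names "high" and "low" and the function name are assumptions, and the effect of formamide is not modelled.

```python
# Ranges as stated in the text: (minimum, maximum), inclusive.
STRINGENCY_CONDITIONS = {
    "high": {"salt_molar": (0.02, 0.15), "temperature_c": (50, 70)},
    "low": {"salt_molar": (0.15, 0.9), "temperature_c": (20, 55)},
}


def matching_stringency(salt_molar, temperature_c):
    """Return the regimes whose salt and temperature ranges both contain
    the given values. The stated ranges overlap at their edges, so a
    condition may match both regimes or neither."""
    matches = []
    for name, ranges in STRINGENCY_CONDITIONS.items():
        salt_low, salt_high = ranges["salt_molar"]
        temp_low, temp_high = ranges["temperature_c"]
        if salt_low <= salt_molar <= salt_high and temp_low <= temperature_c <= temp_high:
            matches.append(name)
    return matches


print(matching_stringency(0.05, 65))
print(matching_stringency(0.5, 37))
```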

In certain embodiments, it will be advantageous to
employ nucleic acid sequences of the present invention in
combination with an appropriate means, such as a label,
for determining hybridization. A wide variety of
appropriate indicator means are known in the art,
including fluorescent, radioactive, enzymatic or other
ligands, such as avidin/biotin, which are capable of
giving a detectable signal. In preferred embodiments,
one will likely desire to employ a fluorescent label or
an enzyme tag, such as urease, alkaline phosphatase or
peroxidase, instead of radioactive or other environmentally
undesirable reagents. In the case of enzyme tags,
colorimetric indicator substrates are known which can be
employed to provide a means visible to the human eye or
spectrophotometrically, to identify specific
hybridization with complementary nucleic acid-containing
samples.

In general, it is envisioned that the hybridization
probes described herein will be useful both as reagents
in solution hybridization as well as in embodiments
employing a solid phase. In embodiments involving a
solid phase, the test DNA (or RNA) is adsorbed or
otherwise affixed to a selected matrix or surface. This
fixed, single-stranded nucleic acid is then subjected to
specific hybridization with selected probes under desired
conditions. The selected conditions will depend on the
particular circumstances based on the particular criteria
required (depending, for example, on the G+C contents,
type of target nucleic acid, source of nucleic acid, size
of hybridization probe, etc.). Following washing of the
hybridized surface so as to remove nonspecifically bound
probe molecules, specific hybridization is detected, or
even quantified, by means of the label.

It is contemplated that longer DNA segments will
find utility in the recombinant production of peptides or
proteins. DNA segments which encode peptides of from
about 15 to about 50 amino acids in length, or more
preferably, from about 15 to about 30 amino acids in
length are contemplated to be particularly useful in
certain embodiments, e.g., in raising anti-peptide
antibodies. DNA segments encoding larger polypeptides,
domains, fusion proteins or the entire MLL protein will
also be useful. DNA segments encoding peptides will
generally have a minimum coding length in the order of
about 45 to about 90 or 150 nucleotides, whereas DNA
segments encoding larger MLL proteins, polypeptides,
domains or fusion proteins may have coding segments
encoding about 350, 430 or about 650 amino acids, and may
be about 1.2 kb, 4.1 kb or even about 8.3 kb in length.
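
The minimum coding lengths quoted above follow from the
triplet nature of the genetic code (three nucleotides per
amino acid). A minimal sketch of that arithmetic is given
below; the function name is hypothetical and the optional
stop codon is an assumption added for illustration.

```python
def coding_length_nt(n_amino_acids, include_stop=False):
    """Nucleotides needed to encode a peptide of the given length."""
    # Each amino acid is specified by one three-nucleotide codon.
    length = 3 * n_amino_acids
    if include_stop:
        length += 3  # one additional codon to terminate translation
    return length


if __name__ == "__main__":
    for aa in (15, 30, 50):
        print(f"{aa} amino acids -> {coding_length_nt(aa)} nucleotides")
```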

The nucleic acid segments of the present invention,
regardless of the length of the coding sequence itself,
may be combined with other DNA sequences, such as
promoters, polyadenylation signals, additional
restriction enzyme sites, multiple cloning sites, other
coding segments, and the like, such that their overall
length may vary considerably. It is contemplated that a
nucleic acid fragment of almost any length may be
employed, with the total length preferably being limited
by the ease of preparation and use in the intended
recombinant DNA protocol. For example, nucleic acid
fragments may be prepared in accordance with the present
invention which are up to 20,000 base pairs in length, as
may segments of 10,000, 5,000 or about 3,000, or of about
1,000 base pairs in length or less.

It will be understood that this invention is not
limited to the particular nucleic and amino acid
sequences of seq id nos:1 through 6 and seq id nos:7 and
8, respectively. Therefore, DNA segments prepared in
accordance with the present invention may also encode
biologically functional equivalent proteins or peptides
which have variant amino acid sequences. Such sequences
may arise as a consequence of codon redundancy and
functional equivalency which are known to occur naturally
within nucleic acid sequences and the proteins thus
encoded. Alternatively, functionally equivalent proteins
or peptides may be created via the application of
recombinant DNA technology, in which changes in the
protein structure may be engineered, based on
considerations of the properties of the amino acids being
exchanged.

DNA segments encoding an MLL gene may be introduced
into recombinant host cells and employed for expressing
the encoded protein. Alternatively, through the
application of genetic engineering techniques,
subportions or derivatives of selected MLL genes may be
employed. Equally, through the application of
site-directed mutagenesis techniques, one may re-engineer DNA
segments of the present invention to alter the coding
sequence, e.g., to introduce improvements to the
antigenicity of the protein or to test MLL protein
mutants in order to examine the structure-function
relationships at the molecular level. Where desired, one
may also prepare fusion peptides, e.g., where the MLL
coding regions are aligned within the same expression
unit with other proteins or peptides having desired
functions, such as for immunodetection purposes (e.g.,
enzyme label coding regions), for stability purposes, for
purification or purification and cleavage, or to impart
any other desirable characteristic to an MLL-based fusion
product.

MLL Protein Expression, Purification and Uses

In certain embodiments, DNA segments encoding MLL
protein portions may be produced and employed to express
the MLL proteins, domains or fusions thereof. Such DNA
segments will generally encode proteins including MLL
amino acid sequences of between about 100, 200, 250, 300
or about 650 amino acids, although longer sequences up to
and including about 3800 or 3968 MLL amino acids are also
contemplated. MLL protein regions which are both
telomeric and centromeric to the breakpoint region may be
produced, as exemplified herein by the generation of
fusion proteins including MLL amino acids set forth in
seq id no:8 and by amino acids 323-623 of seq id no:7.
Other specific regions contemplated by the inventors to
be particularly useful include, for example, the zinc
finger regions represented by amino acids 574-1184, and
more particularly, those including amino acids 574 to
about 810 and about 1057 to 1184 of seq id no:7.

As a point of comparison with other nomenclature
currently used in the art, the MLL amino acids of clone
14-7 (seq id no:8), telomeric to the breakpoint region,
correspond to the HRX amino acids 2772-3209 in Figure 4
of Tkachuk et al. (1992), and the MLL amino acids 323-623
of clone 14P-18B (seq id no:7), centromeric to the
breakpoint region, correspond to the HRX amino acids
1101-1400 (Tkachuk et al., 1992). It should also be
noted here that the cDNA clone 14P-18B (seq id no:4)
differs from the published sequence of Tkachuk et al.
(1992) in that clone 14P-18B lacks exon 8 sequences.
This arose as a result of using a cDNA obtained
subsequent to an alternative splicing reaction. Such
alternative splicing is known to occur in other zinc
finger proteins, such as the Wilms tumor protein. The
zinc finger regions in the Tkachuk et al. sequence are
represented generally by amino acids 1350-1700 and
1700-2000.

The expression and purification of MLL proteins is
exemplified herein by the generation of MLL fusion
proteins including glutathione S transferase, by their
expression in E. coli, and by the use of
glutathione-agarose affinity chromatography. However, it will be
understood that there are many methods available for the
recombinant expression of proteins and peptides, any or
all of which will likely be suitable for use in
accordance with the present invention. MLL proteins may
be expressed in both eukaryotic and prokaryotic
recombinant host cells, although it is believed that
bacterial expression has advantages over eukaryotic
expression in terms of ease of use and quantity of
materials obtained thereby.

MLL proteins and peptides produced in accordance
with the present invention may contain only MLL sequences
themselves or may contain MLL sequences linked to other
protein or peptide sequences. The MLL segments may be
linked to other 'natural' sequences, such as those
derived from other chromosomes, and also to 'engineered'
protein or peptide sequences, such as
glutathione-S-transferase (GST), ubiquitin, β-galactosidase,
β-lactamase, antibody domains and, in fact, virtually any
protein or peptide sequence which one desires. The use
of enzyme sensitive peptide sequences, such as, e.g.,
those found in the blood clotting cascade proteins, is
also contemplated. One such application involves the use
of a fusion protein domain for purification, e.g., using
affinity chromatography, and then the subsequent cleavage
of the fusion protein by a specific enzyme to release the
MLL portion of the fusion protein.

As used herein, the term "engineered" or
"recombinant" cell is intended to refer to a eukaryotic
or prokaryotic cell into which a recombinant MLL DNA
segment has been introduced. Therefore, engineered cells
are distinguishable from naturally occurring cells which
do not contain recombinantly introduced DNA, i.e., DNA
introduced through the hand of man. Recombinantly
introduced DNA segments will generally be in the form of
cDNA (i.e., they will not contain introns), although the
use of genomic MLL sequences is not excluded.

For protein expression, one would position the
coding sequences adjacent to and under the control of a
promoter. It is understood in the art that to bring a
coding sequence under the control of a promoter, one
positions the 5' end of the transcription initiation site
of the transcriptional reading frame of the protein
between about 1 and about 50 nucleotides "downstream" of
(i.e., 3' of) the chosen promoter. Where eukaryotic
expression is contemplated, one will also typically
desire to incorporate into the transcriptional unit an
appropriate polyadenylation site (e.g., 5'-AATAAA-3') if
one was not contained within the original cloned segment.
Typically, the poly A addition site is placed about 30 to
2000 nucleotides "downstream" of the termination site of
the protein at a position prior to transcription
termination.
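
As a simple illustration of the placement rule just
described, the following sketch scans a construct for the
5'-AATAAA-3' signal and reports only those occurrences lying
about 30 to 2000 nucleotides downstream of the end of the
coding sequence. The function name, its parameters and the
toy sequence are hypothetical and are not part of the
disclosure.

```python
def find_polya_signals(seq, stop_end, min_offset=30, max_offset=2000,
                       signal="AATAAA"):
    """Return start indices of poly A signals in the allowed window.

    seq      -- DNA sequence of the transcriptional unit (5' to 3')
    stop_end -- index of the first base after the stop codon
    """
    seq = seq.upper()
    hits = []
    start = seq.find(signal, stop_end)
    while start != -1:
        offset = start - stop_end          # distance downstream of the stop
        if min_offset <= offset <= max_offset:
            hits.append(start)
        start = seq.find(signal, start + 1)
    return hits


if __name__ == "__main__":
    # Toy construct: ATG, five codons, a TAA stop, a 40 nt spacer, the signal.
    construct = "ATG" + "GCT" * 5 + "TAA" + "C" * 40 + "AATAAA" + "G" * 10
    print(find_polya_signals(construct, stop_end=21))
```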

The promoters used will generally be recombinant or
heterologous promoters. As used herein, a recombinant or
heterologous promoter is intended to refer to a promoter
that is not normally associated with the MLL gene in
its natural environment. Such promoters may include
virtually any promoter isolated from any bacterial or
eukaryotic cell. Naturally, it will be important to
employ a promoter that effectively directs the expression
of the DNA segment in the cell type chosen for
expression. The use of promoter and cell type
combinations for protein expression is generally known to
those of skill in the art of molecular biology, for
example, see Sambrook et al. (1989). The promoters
employed may be constitutive, or inducible, and can be
used under the appropriate conditions to direct high
level expression of the introduced DNA segment, such as
is advantageous in the large-scale production of
recombinant proteins or peptides.

Further aspects of the present invention concern the
purification or substantial purification of MLL-based
proteins. The term "purified" as used herein, is
intended to refer to a composition which includes a
protein incorporating an MLL amino acid sequence, wherein
the protein is purified to any degree relative to its
naturally-obtainable state. The "naturally-obtainable
state" may be relative to the purity within a human cell
or cell extract, e.g., for an MLL fusion protein produced
in leukemic cells of a given patient, or may be relative
to the purity within an engineered cell or cell extract,
e.g., for a man-made MLL fusion protein.



Generally, "purified" will refer to an MLL protein
or MLL peptide composition which has been subjected to
fractionation to remove various non-MLL protein
components such as other cell components. Various
techniques suitable for use in protein purification will
be well known to those of skill in the art. These
include, for example, precipitation with ammonium
sulphate, PEG, antibodies and the like or by heat
denaturation, followed by centrifugation; chromatography
steps such as ion exchange, gel filtration, reverse
phase, hydroxylapatite and affinity chromatography;
isoelectric focusing; gel electrophoresis; and
combinations of such and other techniques. A specific
example presented herein is the purification of MLL:GST
fusion proteins using glutathione-agarose affinity
chromatography, followed by preparative
SDS-polyacrylamide gel electrophoresis and electroelution.

The recombinant peptides or proteins produced from
the DNA segments of the present invention will have uses
in a variety of embodiments. For example, peptides,
polypeptides and full-length proteins may be employed in
the generation of antibodies directed against the MLL
protein and antigenic sub-portions of the protein.
Techniques for the production of polyclonal and
monoclonal antibodies are described hereinbelow and are
well known to those of skill in the art. The production
of antibodies would be particularly useful as this would
enable further detailed analyses of the location and
function of the MLL protein, and MLL-related species,
which clearly have an important role in mammalian cells
and other cell types. The proteins may also be employed
in various assays, such as DNA binding assays, and
proteins and peptides may be employed to define the
precise regions of the MLL protein which interact with
targets, such as DNA, receptors, enzymes, substrates, and
the like.

Recombinant Host Cells and Vectors

Prokaryotic hosts are generally preferred for
expression of MLL proteins. Examples of useful
prokaryotic hosts include E. coli, such as strain JM101
which is particularly useful, Bacillus subtilis,
Salmonella typhimurium, Serratia marcescens, and various
Pseudomonas species. In general, plasmid vectors
containing replicon and control sequences which are
derived from species compatible with the host cell should
be used in connection with these hosts. Such vectors
ordinarily carry a replication site and a compatible
promoter as well as marking sequences which are capable
of providing phenotypic selection in transformed cells,
such as genes for ampicillin or tetracycline resistance.
Those promoters most commonly used in recombinant DNA
construction include the β-lactamase (penicillinase) and
lactose promoter systems and the tryptophan (trp)
promoter system.

In addition to prokaryotes, eukaryotic microbes,
such as yeast cultures, may also be used. Saccharomyces
cerevisiae (common baker's yeast) is the most commonly
used among eukaryotic microorganisms, although a number
of other strains are commonly available. For expression
in Saccharomyces, the plasmid YRp7, containing the trp1
gene, is commonly used. Suitable promoting sequences in
yeast vectors include the promoters for
3-phosphoglycerate kinase or other glycolytic enzymes
such as enolase, glyceraldehyde-3-phosphate
dehydrogenase, hexokinase, pyruvate decarboxylase,
phosphofructokinase, glucose-6-phosphate isomerase,
3-phosphoglycerate mutase, pyruvate kinase, triosephosphate
isomerase, phosphoglucose isomerase, and glucokinase. In
constructing suitable expression plasmids, the
termination sequences associated with these genes are
also ligated into the expression vector 3' of the
sequence desired to be expressed to provide
polyadenylation of the mRNA and termination.

Other promoters, which have the additional advantage
of transcription controlled by growth conditions, are the
promoter regions for alcohol dehydrogenase 2,
isocytochrome C, acid phosphatase, degradative enzymes
associated with nitrogen metabolism, and the
aforementioned glyceraldehyde-3-phosphate dehydrogenase,
and enzymes responsible for maltose and galactose
utilization. Any plasmid vector containing a
yeast-compatible promoter, an origin of replication, and
termination sequences is suitable.

In addition to microorganisms, cultures of cells
derived from multicellular (eukaryotic) organisms may
also be used as hosts. In principle, any such cell
culture is workable, whether from vertebrate or
invertebrate culture. However, interest has been
greatest in vertebrate cells, and propagation of
vertebrate cells in culture (tissue culture) has become a
routine procedure in recent years. Examples of such
useful host cell lines are VERO and HeLa cells, Chinese
hamster ovary (CHO) cell lines, and WI38, BHK, COS-7, 293
and MDCK cell lines. Expression vectors for such cells
ordinarily include (if necessary) an origin of
replication, a promoter located in front of the gene to
be expressed, along with any necessary ribosome binding
sites, RNA splice sites, polyadenylation site, and
transcriptional terminator sequences.

For use in mammalian cells, the control functions on
the expression vectors are often provided by viral
material. For example, commonly used promoters are
derived from polyoma, Adenovirus 2, and most frequently
Simian Virus 40 (SV40). The early and late promoters of
SV40 virus are particularly useful because both are
obtained easily from the virus as a fragment which also
contains the SV40 viral origin of replication. Smaller
or larger SV40 fragments may also be used, as may
adenoviral vectors which are known to be particularly
useful recombinant tools.

The origin of replication may be provided either by
construction of the vector to include an exogenous
origin, such as may be derived from SV40 or other viral
(e.g., Polyoma, Adeno, VSV, BPV) source, or may be
provided by the host cell chromosomal replication
mechanism. If the vector is integrated into the host
cell chromosome, the latter is often sufficient.

Biological Functional Equivalents

As is known in the art, modifications and changes may
be made in protein structure and still obtain a molecule
having like or otherwise desirable characteristics. For
example, certain amino acids may be substituted for other
amino acids in a protein structure without appreciable
loss of interactive binding capacity with structures such
as, for example, DNA, enzymes and substrate molecules.
Since it is the interactive capacity and nature of a
protein that defines that protein's biological functional
activity, certain amino acid sequence substitutions can
be made in a protein sequence (or, of course, its
underlying DNA coding sequence) and nevertheless obtain a
protein with like or even countervailing properties
(e.g., antagonistic v. agonistic). The present invention
thus encompasses MLL proteins and peptides including
certain sequence changes.

In making conservative changes, the hydropathic
index of amino acids may be considered. The importance
of the hydropathic amino acid index in conferring
interactive biologic function on a protein is generally
understood in the art (Kyte & Doolittle, 1982) and it is
known that certain amino acids may be substituted for
other amino acids having a similar hydropathic index or
score and still result in a protein with similar
biological activity. Each amino acid has been assigned a
hydropathic index on the basis of its hydrophobicity
and charge characteristics; these are: isoleucine (+4.5);
valine (+4.2); leucine (+3.8); phenylalanine (+2.8);
cysteine/cystine (+2.5); methionine (+1.9); alanine
(+1.8); glycine (-0.4); threonine (-0.7); serine (-0.8);
tryptophan (-0.9); tyrosine (-1.3); proline (-1.6);
histidine (-3.2); glutamate (-3.5); glutamine (-3.5);
aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and
arginine (-4.5). In making changes, the substitution of
amino acids whose hydropathic indices are within ±2 is
preferred, those which are within ±1 are particularly
preferred, and those within ±0.5 are even more
particularly preferred.
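
The hydropathic index rule set out above lends itself to a
simple lookup. The sketch below uses the index values listed
in the preceding paragraph and classifies a proposed
substitution into the ±0.5, ±1 and ±2 tiers; the function
names and tier labels are hypothetical and are offered only
as one possible way of applying the rule.

```python
# Hydropathic indices as listed above (Kyte & Doolittle, 1982),
# keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_gap(a, b):
    """Absolute difference between the hydropathic indices of two residues."""
    # Rounded to one decimal to avoid floating-point noise at tier boundaries.
    return round(abs(KYTE_DOOLITTLE[a.upper()] - KYTE_DOOLITTLE[b.upper()]), 1)


def preference_tier(a, b):
    """Classify a substitution according to the tiers described in the text."""
    gap = hydropathy_gap(a, b)
    if gap <= 0.5:
        return "even more particularly preferred"
    if gap <= 1.0:
        return "particularly preferred"
    if gap <= 2.0:
        return "preferred"
    return "not preferred"


if __name__ == "__main__":
    for pair in (("I", "V"), ("K", "R"), ("I", "R")):
        print(pair, hydropathy_gap(*pair), preference_tier(*pair))
```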

Substitution of like amino acids can also be made on
the basis of hydrophilicity, particularly where the
biological functional equivalent protein or peptide
thereby created is intended for use in immunological
embodiments. U.S. Patent 4,554,101, incorporated herein
by reference, states that the greatest local average
hydrophilicity of a protein, as governed by the
hydrophilicity of its adjacent amino acids, correlates
with its immunogenicity and antigenicity, i.e. with a
biological property of the protein.

As detailed in U.S. Patent 4,554,101, the following
hydrophilicity values have been assigned to amino acid
residues: arginine (+3.0); lysine (+3.0); aspartate (+3.0
± 1); glutamate (+3.0 ± 1); serine (+0.3); asparagine
(+0.2); glutamine (+0.2); glycine (0); threonine (-0.4);
proline (-0.5 ± 1); alanine (-0.5); histidine (-0.5);
cysteine (-1.0); methionine (-1.3); valine (-1.5);
leucine (-1.8); isoleucine (-1.8); tyrosine (-2.3);
phenylalanine (-2.5); tryptophan (-3.4). It is
understood that an amino acid can be substituted for
another having a similar hydrophilicity value and still
obtain a biologically equivalent, and in particular, an
immunologically equivalent protein. In such changes, the
substitution of amino acids whose hydrophilicity values
are within ±2 is preferred, those which are within ±1 are
particularly preferred, and those within ±0.5 are even
more particularly preferred.
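
For a given residue, the hydrophilicity values listed above
can be used to enumerate candidate replacements falling
within a chosen tolerance. The following is a minimal sketch
only; the function name is hypothetical, and for aspartate,
glutamate and proline the central value is used and the
stated ± 1 range is ignored, which is a simplifying
assumption.

```python
# Hydrophilicity values as listed above (U.S. Patent 4,554,101),
# keyed by one-letter amino acid code.
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def substitution_candidates(residue, tolerance=2.0):
    """List residues whose hydrophilicity lies within +/- tolerance."""
    residue = residue.upper()
    reference = HYDROPHILICITY[residue]
    return sorted(
        aa
        for aa, value in HYDROPHILICITY.items()
        # Rounded to one decimal to avoid floating-point noise at the limit.
        if aa != residue and round(abs(value - reference), 1) <= tolerance
    )


if __name__ == "__main__":
    for tol in (0.5, 1.0, 2.0):
        print("S within", tol, "->", substitution_candidates("S", tol))
```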

As outlined above, amino acid substitutions are
generally therefore based on the relative similarity of
the amino acid side-chain substituents, for example,
their hydrophobicity, hydrophilicity, charge, size, and
the like. Exemplary substitutions which take various of
the foregoing characteristics into consideration are well
known to those of skill in the art and include: arginine
and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine
and isoleucine.

While discussion has focused on functionally
equivalent polypeptides arising from amino acid changes,
it will be appreciated that these changes may be effected
by alteration of the encoding DNA; taking into
consideration also that the genetic code is degenerate
and that two or more codons may code for the same amino
acid.
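
The degeneracy of the genetic code referred to above can be
made concrete by tabulating the standard code and listing
the codons that specify a given amino acid. The sketch below
assumes the standard (universal) genetic code written in the
conventional T, C, A, G order; the function name is
hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order ("*" = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return every codon that encodes the given one-letter amino acid."""
    return [c for c, aa in CODON_TABLE.items() if aa == amino_acid.upper()]


if __name__ == "__main__":
    for aa in ("L", "R", "M", "W"):
        print(aa, synonymous_codons(aa))
```

Any of the codons returned for a residue may be used in the
encoding DNA without changing the encoded amino acid.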

Antibody Generation

As disclosed hereinbelow (see Example IV), now that
the inventors have made possible the production of
various MLL proteins, the generation of antibodies is a
relatively straightforward matter. Antibody generation
is generally known to those of skill in the art and many
experimental animals are available for such purposes.

In addition to the polyclonal antisera described
herein, the inventors also contemplate the production of
specific monoclonal antibodies. Monoclonal antibodies
(MAbs) specific for the MLL protein of the present
invention may be prepared using conventional techniques.
Initially, an MLL-containing composition would be used
to immunize an experimental animal, such as a mouse, from
which a population of spleen or lymph cells would be
obtained. The spleen or lymph cells would then be fused
with cell lines, such as human or mouse myeloma strains,
to produce antibody-secreting hybridomas. These
hybridomas may be isolated to obtain individual clones
which can then be screened for production of antibody to
the desired MLL protein.

For fusing spleen and myeloma or plasmacytoma cells
to produce hybridomas secreting monoclonal antibodies
against MLL, any of the standard fusion protocols may be
employed, such as those described in, e.g., The Cold
Spring Harbor Manual for Hybridoma Development,
incorporated herein by reference. Hybridomas which
produce monoclonal antibodies to the selected MLL antigen
would then be identified using standard techniques, such
as ELISA and Western blot methods. Hybridoma clones can
then be cultured in liquid media and the culture
supernatants purified to provide MLL-specific monoclonal
antibodies.

Epitopic Core Sequences

The present invention also makes possible the
identification of epitopic core sequences from the MLL
protein, as based on the deduced amino acid sequence
encoded by the MLL gene. The identification of MLL
epitopes directly from the primary sequence, and their
epitopic equivalents, is a relatively straightforward
matter known to those of skill in the art. In
particular, it is contemplated that one would employ the
methods of Hopp, as taught in U.S. Patent 4,554,101,
incorporated herein by reference, which teaches both the
identification of epitopes from amino acid sequences on
the basis of hydrophilicity, and the selection of
biological functional equivalents of such sequences. The
methods described in several other papers, and software
programs based thereon, can also be used to identify
epitopic core sequences, for example, the Jameson and
Wolf computer programs and the Kyte analyses
(Jameson & Wolf, 1988; Wolf et al., 1988; Kyte &
Doolittle, 1982).
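
The hydrophilicity-based approach referred to above may be
illustrated by a sliding-window average along a peptide
sequence, the window with the greatest local average being
taken as a candidate epitopic core. The sketch below uses
the hydrophilicity values recited earlier in this
specification; the six-residue window, the function names
and the toy sequence are assumptions made for illustration
and are not taken from the disclosure.

```python
# Hydrophilicity values as recited earlier (U.S. Patent 4,554,101);
# central values are used where a +/- 1 range was stated.
HOPP_VALUES = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def hydrophilicity_profile(seq, window=6):
    """Average hydrophilicity of every window of the given size."""
    values = [HOPP_VALUES[aa] for aa in seq.upper()]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def most_hydrophilic_window(seq, window=6):
    """Return (start index, subsequence, average) of the peak window."""
    profile = hydrophilicity_profile(seq, window)
    if not profile:
        raise ValueError("sequence is shorter than the window")
    best = max(range(len(profile)), key=profile.__getitem__)
    return best, seq[best:best + window], profile[best]


if __name__ == "__main__":
    toy_peptide = "LLVVIIRKDEKRAAFFWW"  # hypothetical sequence
    print(most_hydrophilic_window(toy_peptide))
```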

The amino acid sequence of an "epitopic core
sequence" thus identified may be readily incorporated
into peptides, either through the application of peptide
synthesis or recombinant technology. As mentioned above,
preferred peptides for use in accordance with the present
invention will generally be on the order of 15 to 50
amino acids in length, and more preferably about 15 to
about 30 amino acids in length. It is proposed that
shorter antigenic peptides which incorporate epitopes of
the MLL protein will provide advantages in certain
circumstances, for example, in the preparation of
antibodies or in immunological detection assays.
Exemplary advantages include the ease of preparation and
purification, the relatively low cost and improved
reproducibility of production, and advantageous
biodistribution.

The MLL Gene

The present inventors recently identified a yeast
artificial chromosome (YAC) that contains the breakpoint
region in leukemias with the most common reciprocal
translocations involving chromosome band 11q23, namely
t(4;11), t(6;11), t(9;11), and t(11;19) (Rowley et al.,
1990). They identified a gene termed MLL, for mixed
lineage leukemia or myeloid/lymphoid leukemia, that spans
the breakpoint on 11q23 (Ziemin-van Der Poel et al.,
1991). This same gene is also referred to as ALL-1
(Cimino et al., 1991; Gu et al., 1992a;b), Htrx (Djabali
et al., 1992) and HRX (Tkachuk et al., 1992) by other
workers in the field, although MLL is the accepted
designation for this gene adopted by the human genome
nomenclature committee (Chromosome Co-ordinating Meeting,
1992).

Recent data indicate that the breakpoint in a cell
line, RC-K8 with a t(11;14)(q23;q32), is approximately
110 kb telomeric to the breakpoint in other 11q23
translocations which involve the MLL gene (Akao et al.,
1991b; Lu & Yunis, 1992; Radice & Tunnacliffe, 1992).
The present inventors propose that there are at least two
different regions of band q23 involved in chromosome
11q23 translocations, and use the term "more centromeric"
to distinguish MLL rearrangements from those involving
the more telomeric breakpoint, which has been described
as the RCK locus (Akao et al., 1991b) or the p54 gene
(Lu & Yunis, 1992).

Using pulsed-field gel electrophoresis analyses, the
breakpoint region in MLL was mapped to a 92 kb NotI
fragment approximately 100 kb telomeric to the CD3G gene.
Non-repetitive sequences from three genomic clones
isolated from this region detected transcripts in the
estimated 11-12.5 kb size range (normal mRNA) in normal
cells, and in the cell line RS4;11, with a t(4;11), two
highly expressed transcripts whose estimated sizes were
11.0 and 11.5 kb (rearranged mRNA) were detected
(Ziemin-van Der Poel et al., 1991). It should be noted that the
size of these transcripts has been estimated from
measurements on Northern blots. In this size range,
i.e., above about 10 kb, the resolution of agarose gels
is known to be poorer, and hence size determinations made
in this manner may be over- or under-estimates, and be
found to vary about 2 or 3 kb or so, as has been reported
by other groups for the MLL gene (Cimino et al., 1991;
1992).

Improved MLL Probes

Presented herein is evidence that the breakpoints in
the t(4;11), t(6;11), t(9;11), and t(11;19)
translocations are clustered within a 9 kb BamHI genomic
region of the MLL gene, which has been more precisely
defined, by sequencing, as being 8.3 kb in length. Using
a 0.7 kb BamHI cDNA fragment of the MLL gene called MLL
0.7B (seq id no:1), rearrangements on Southern analyses
of DNA from cell lines and patient material with an 11q23
translocation were detected in this region. Probe MLL
0.7B (seq id no:1) is derived from a cDNA clone that
lacks Exon 8 sequences, but this clearly has no adverse
effects on breakpoint detection using this probe, which
is still the most advantageous probe identified to date.

Northern blotting analyses of the MLL gene are also
presented herein. These results demonstrate that the MLL
gene has multiple transcripts, some of which appear to be
lineage specific. In normal pre-B cells, four normal
mRNA transcripts estimated to be of about 12.5, 12.0,
11.5 and 2.0 kb in size are detected. These transcripts
are also present in monocytoid cell lines with additional
hybridization to an estimated 5.0 kb normal mRNA
transcript, indicating that expression of different sized
MLL transcripts may be associated with normal
hematopoietic lineage development.

In a cell line with a t(4;11), the expression of the
large 12.5, 12.0 and 11.5 kb transcripts is reduced, and
there is evidence of three other altered mRNA transcripts
estimated to be of 11.5, 11.25 and 11.0 kb. In the
Karpas 45 cell line (K45), with a t(X;11)(q13;q23)
translocation, aberrant mRNA transcripts with estimated
sizes of about 8 kb and about 6 kb were detected. These
translocations result in rearrangements of the MLL gene
and may lead to altered function(s) of the MLL gene as
well as that of other gene(s) involved in the
translocation.

In further studies, unique sequences from the 0.7
kilobase BamHI fragment, corresponding to the centromeric
and telomeric ends of the 8.3 kilobase germline fragment,
were amplified by the polymerase chain reaction (PCR) and
were used as probes to distinguish the chromosomal origin
of rearranged bands on Southern blot analysis. Patient
samples were selected on the basis of a karyotype
containing an 11q23 abnormality and the availability of
cryopreserved bone marrow or peripheral blood. Sixty-one
patients with acute leukemia and 11q23 aberrations, three
cell lines derived from such patients, and 20 patients
with non-Hodgkin's lymphomas were analyzed.

It was found that the 0.7 kilobase cDNA fragment
(seq id no:1) detected DNA rearrangements with a single
BamHI digest in 58 leukemia patients and three cell lines
with 11q23 abnormalities. This includes all cases (46
patients and two cell lines) with the common 11q23
translocations involving chromosomes 4, 6, 9, and 19. In
addition, rearrangements were identified in 16 other
cases with 11q23 anomalies, including translocations,
insertions, and inversions. Rearrangements were not
detected in three patients with leukemia and uncommon
11q23 translocations. Three of the 20 patients with
lymphoma also had rearrangements. All of these breaks
were first shown to occur within a 9 kilobase breakpoint
cluster region, later identified as occurring within a
region only 8.3 kb in length. Nineteen different
chromosome breakpoints were associated with the MLL gene
in these rearrangements, suggesting that MLL is
juxtaposed to 19 different genes. In 70% of these cases,
two rearranged bands, corresponding to the two derivative
chromosomes, were detected, and in 30%, only one
rearranged band was present. In cases with only one
rearranged band, it was always detected by only the
centromeric probe. Thus, the sequences centromeric to
the breakpoint are always preserved, whereas telomeric
sequences are deleted in 30% of cases.

It can be clearly seen that the 0.7 kilobase cDNA
probe of the present invention detects rearrangements on
Southern blot analysis with a single BamHI restriction
digest in all patients with the common 11q23
translocations. The same breakpoint occurs in at least
14 other 11q23 anomalies. The breaks were all found to
occur in a 9 kilobase breakpoint cluster region within
the MLL gene later shown, by sequencing, to be an 8.3 kb
region. The present inventors have, therefore, developed
specific probes that can distinguish between the two
derivative chromosomes. In cases with only one
rearranged band, the exon sequences immediately distal to
the breakpoint are deleted. This cDNA probe will be very
useful clinically both in diagnosis of rearrangements of
the MLL gene as well as in monitoring patients during the
course of their disease.

The following examples are included to demonstrate
preferred embodiments of the invention. It should be
appreciated by those of skill in the art that the
techniques disclosed in the examples which follow
represent techniques discovered by the inventor to
function well in the practice of the invention, and thus
can be considered to constitute preferred modes for its
practice. However, those of skill in the art should, in
light of the present disclosure, appreciate that many
changes can be made in the specific embodiments which are
disclosed and still obtain a like or similar result
without departing from the spirit and scope of the
invention.

EXAMPLE I

Cloning of cDNAs of the MLL Gene that Detect DNA
Rearrangements and Altered RNA Transcripts in
Human Leukemic Cells with 11q23 Translocations

1. Materials and Methods

CELL LINES AND PATIENT MATERIAL. The
characterization of the cell lines RS4;11, RCH-ADD (an
EBV transformed cell line with a normal karyotype from a
patient with leukemia and a t(1;19)), SUP-T13, U937 and
RC-K8 has been described (Stong & Kersey, 1985; Jack et
al., 1986; Smith et al., 1989; Kubonoshi et al., 1986;
Sundstrom & Nilsson, 1976). The clinical and cytogenetic
characteristics of the patient material and cell lines
with 11q23 translocations are listed in Table 1.

PREPARATION AND SCREENING OF A cDNA LIBRARY.
Poly(A)+ RNA was isolated from a monocytic cell line
(U937) using the Fast Track Isolation mRNA Kit
(Invitrogen), and a custom random primed and oligo-d(T)
primed cDNA library was made by Stratagene. A cDNA
library with a titre of 1.4 x 10^6 pfu/ml cloned into the
EcoRI site of Lambda Zap II was obtained. One half
million plaques were plated and hybridized separately
with two 32P-labelled probes, a 2.1 kb BamHI/SstI fragment
from the telomeric end of genomic clone 14 (Ziemin-van
Der Poel et al., 1991) referred to as 14BS and a 0.8 kb
PstI fragment from the centromeric end, 14P (Fig. 1).
Labeling and hybridization protocols were as previously
described (Shima et al., 1986). Positive clones were
purified and subcloned into the Bluescript vector using
the in vivo plasmid excision protocol (Stratagene).
Clones were characterized by Southern blot hybridization
and were subsequently mapped and sequenced using the
Sequenase Kit (United States Biochemical).

NORTHERN AND SOUTHERN ANALYSES. DNA was extracted
from both cell lines and from patient material. Ten
micrograms of each sample was digested with restriction
enzymes, separated on agarose gels and transferred to
nylon membranes. Poly(A)+ RNA was extracted from 100 x
10^6 cells in logarithmic or stationary growth phase using
the Fast Track Isolation Kit (Invitrogen). Five
micrograms of formamide/formaldehyde denatured RNA was
electrophoresed on a 0.8% agarose gel at 40 volts/cm for
16 or 20 hours and transferred to nylon membranes.
Hybridization and labeling protocols were as described
previously (Shima et al., 1986).

2. Results

cDNA Clones 

Using a non-repetitive sequence called 14BS (2.1 kb)
(Fig. 1) from the telomeric end of genomic clone 14
(Ziemin-van Der Poel et al., 1991), the present inventors
detected two cDNA clones, 14-7 (1.3 kb) and 14-9 (1.4 kb).
Mapping and sequencing of these two clones revealed
approximately 0.5 kb of homology, and clone 14-9
contained a long stretch of Alu repeats. Clone 14-7 had
an open reading frame (ORF) that extended for the entire
insert length with a predicted direction of transcription
of MLL from centromere to telomere. Using a unique
centromeric fragment, 14P (0.8 kb), of clone 14, three
additional cDNA clones were obtained, namely 14P-18A
(1.1 kb), 14P-18B (4.1 kb) and 14P-18C (2.0 kb). The
relationship of all these clones is set forth in
Fig. 1. The organization of the genomic segment is shown
in Fig. 9 and the entire 8.3 kb genomic region is
represented by seq id no:6. cDNA clone 14P-18B (seq id
no:4) differs from the published sequence of Tkachuk et
al. (1992) in that clone 14P-18B lacks exon 8 sequences.

Sequence analyses indicated that the cDNA clone
14P-18A is completely contained in 14P-18B, while the region
of homology of 14P-18B with 14P-18C is only 0.2 kb. As
is the case with clone 14-9, 14P-18C also contains
stretches of Alu repeats. All of the cDNA clones were
hybridized to Southern blots with genomic DNA digested
with a range of restriction enzymes, and Fig. 1 shows the
alignment of the BamHI sites in the cDNA clones to
approximately 50 kb of genomic sequence. The genomic
BamHI sites are the same as those reported by Cimino et
al. (1992) for this same gene, which they term ALL-1. The
SalI and SstI sites in the cDNA clones and the genomic
sequence were related by hybridization to Southern blots
of the BamHI 14 kb genomic fragment. Aligning clone
14-7 with clone 14P-18B indicates that this is an almost
continuous cDNA sequence of 5.4 kb of the MLL gene.

Southern Analyses

Southern blots of DNA from control samples, cell
lines and patient material with 11q23 translocations were
hybridized to an internal 0.7 kb BamHI fragment of
14P-18B termed MLL 0.7B, and subsequently referred to as 0.7B
(Fig. 2). This probe detects a 9 kb BamHI germ line
band, and also detects DNA rearrangements in samples with
a t(4;11), t(6;11), t(9;11), and t(11;19) tested to date
(Fig. 3 and Example II). In most of the samples tested,
this probe detected two rearranged bands, indicating
hybridization to both derivative chromosomes. In the
cell line SUP-T13, which has a t(11;19), this 0.7B probe
hybridized very weakly to at least two rearranged bands,
suggesting a deletion which includes DNA sequences
homologous to the probe (Fig. 3, lane 6). In the RC-K8
cell line with a t(11;14) (Fig. 3, lane 8), no
rearrangement was detected.

Northern Analyses 

To determine the nature of the transcripts detected
by the cloned cDNAs, sequential hybridizations to the
same Northern blots were performed. The cDNA clones used
were 14-7, and three adjacent fragments of the cDNA clone
14P-18B, namely a 0.3 kb BamHI/EcoRI fragment termed MLL
0.3BE (0.3BE), a 0.7 kb BamHI fragment (MLL 0.7B, or
0.7B), and a 1.5 kb EcoRI/BamHI fragment termed MLL 1.5EB
or 1.5EB (Fig. 2). These fragments are cDNAs that are
telomeric to, span, and are centromeric to the breakpoint
junction, respectively. It should be noted that the
EcoRI site used to excise the 1.5 kb fragment was a
cloning site.

The most telomeric cDNA clone, 14-7, detected two
large transcripts of 12.0 and 11.5 kb in normal cell
lines (EBV immortalized B cells) and in the cell line
RC-K8 (Fig. 4A panel a). However, in the RS4;11 cell line
three transcripts of estimated sizes 12.0, 11.5 and
11.0 kb were evident (Fig. 4B panel a). There was only
weak hybridization to the normal 12.0 and 11.0 kb message
in the latter sample, while the 11.5 kb transcript was
expressed in high abundance (Fig. 4a where actin is used
as a control probe). The ratio of expression of the 11.5
and 11.0 kb transcripts in the RS4;11 cell line was
dependent upon the state of cell growth when RNA was
extracted (compare Figs. 4A panel a and 4B panel a).

On separate hybridizations with all three of these
fragments (0.3BE, 0.7B and 1.5EB) of clone 14P-18B, the
estimated 12.0 and 11.5 kb transcripts were detected in
normal cell lines (Fig. 4A, panel a-c). The 0.3BE probe
also detected a normal 2.0 kb transcript which was
expressed in all cell lines tested so far. In monocytoid
cell lines the 0.3BE probe detected an additional
transcript of 5.0 kb. In addition to hybridization to
the estimated 12.0 and 11.5 kb transcripts in normal cell
lines, the most centromeric 1.5EB probe detected the
large 12.5 kb transcript, which the present inventors
have described as an MLL transcript that spans the
breakpoint (Ziemin-van Der Poel et al., 1991).

It is important to stress that the size
determination of larger sized nucleic acids using
Northern blotting is not always completely accurate. In
the size range of about 9-10 kb, and above, it is known
that the poorer resolution of agarose gels can lead to
the over- or under-estimation of transcript size. Such
determinations may even differ by up to about 2 kb or so.
Therefore, it will be understood that all references to
size determinations in the results and discussions which
follow are the currently best available estimate of the
transcript size, and may not precisely correlate with the
size determined by other means, such as, for example, by
direct sequencing.

In the RS4;11 cell line, there was evidence of
differential hybridization of these probes to
transcripts. Figure 4B shows a Northern blot with RNA
from the RS4;11 cell line electrophoresed for 20 hours to
obtain better resolution of the large size transcripts.
The 0.3BE probe hybridized very strongly to the
over-expressed rearranged 11.5 kb and the 11.0 kb transcripts
with weak hybridization to a transcript of 12.0 kb.
There was also hybridization to the two smaller normal
transcripts of 5.0 and 2.0 kb (Fig. 4B panel b). The
adjacent 0.7B probe, which detected DNA rearrangements in
cells with 11q23 translocations, hybridized to the
over-expressed 11.5 kb and 11.0 kb rearranged transcripts with
weak hybridization to the normal 12.0 kb transcript as
above. However, this 0.7B probe also detected a
rearranged mRNA transcript estimated to be 11.25 kb (Fig.
4B panel c) in these cells with a t(4;11). Finally, the
1.5EB probe, which is centromeric to the breakpoint
junction, also detected this rearranged 11.25 kb
transcript with weak hybridization to the normal 12.5,
12.0 and 11.5 kb transcripts (Fig. 4B panel d). As a
notable exception, this 1.5EB probe did not detect the
over-expressed 11.5 kb transcript and the 11.0 kb
transcript in the RS4;11 cell line. The detection of
different mRNA transcripts by these probes is summarized
in Table 2, and also represented graphically in Figure 5.

3. Discussion

The inventors have isolated several cDNA clones of
the MLL gene, of which the internal 0.7 kb BamHI fragment
of cDNA clone 14P-18B (0.7B) detected rearrangements in
leukemic samples with the centromeric 11q23 translocation
(Fig. 3 and Example II). The data presented herein
indicate that the breakpoints in band 11q23 in the common
translocations which involve chromosomes 4, 6, 9 and 19
are clustered within an 8.3 kb region of the MLL gene.
In many of the samples, this probe detected two
rearranged bands, indicating hybridization to both
derivative chromosomes. This implies that this 0.7B
fragment contains DNA sequences from both ends of the
9 kb BamHI genomic fragment; see also Example II.

DNA rearrangements were not detected in the RC-K8
cell line, which has a t(11;14)(q23;q32), which further
confirms the existence of at least two distinct
breakpoint regions in 11q23 (Rowley et al., 1990; Akao et
al., 1991b; Lu & Yunis, 1992; Radice & Tunnacliffe,
1992). One is the more centromeric region and involves
the MLL gene, whereas the other is at least 110 kb
telomeric and includes the breakpoint seen in the RC-K8
cell line (Akao et al., 1991b; Lu & Yunis, 1992; Radice &
Tunnacliffe, 1992). Furthermore, Lu and Yunis have
determined that the 5' non-coding region of the p54 gene
is split in this more telomeric 11q23 translocation,
which indicates that the p54 gene is different from MLL.

Figure 1 shows the alignment of the cDNAs to genomic
sequences which span approximately 50 kb. The largest
cDNA, 14P-18B, is 4.1 kb, and it is located centromeric to
clone 14-7 to give 5.4 kb of almost continuous cDNA
sequence. The inventors have therefore cloned more than
one third of the 11.0, 11.5, 12.0 and 12.5 kb transcripts
of the MLL gene. Two other cDNAs, 14P-18C and 14-9,
contain Alu repetitive sequences and share limited
homology with 14P-18B and 14-7, respectively (Fig. 1).
This indicates that these cDNAs are derived either from
different transcripts or from incompletely
processed transcripts. It is now known that virtually
all 12.5 to 15.0 kb of the MLL gene is an open reading
frame and that there is homology between MLL and the zinc
finger region of the Drosophila trithorax gene (Tkachuk
et al., 1992; Gu et al., 1992a).

Use of fragments of the cDNA clones in Northern
hybridizations provided evidence of a range of MLL
transcript sizes in different hematopoietic lineages as
well as of alternative exon splicing of the MLL gene
transcripts. The normal transcripts, estimated to be
2.0, 11.5, 12.0 and 12.5 kb in length, are expressed in
both hematopoietic and non-hematopoietic tissues. The
5.0 kb transcript is detected in monocytic cell lines and
in the T-cell line tested. The level of expression of
the 5.0 kb transcript in the RS4;11 cell line is
approximately 50% of that expressed in the monocytic cell
lines. This result may reflect the biphenotypic nature
of this cell line, which has both pre-B-cell and
monocytoid features.

Northern blot analyses using the 14-7 probe (which
is telomeric to the breakpoint region) detected the two
large transcripts of 12.0 and 11.5 kb in control B cells
and in the RC-K8 cell line. In the RS4;11 cell line,
this probe detected a weak signal at 12.0 kb with strong
hybridization to an 11.5 kb transcript. This probe also
detected an additional smaller transcript of 11.0 kb in
the RS4;11 cell line (Fig. 4B panel a). The 12.0 and
11.0 kb transcripts appear to be in low abundance while
the 11.5 kb transcript is over-expressed. The relative
ratio of hybridization of the estimated 11.5 and 11.0 kb
rearranged mRNA transcripts varies with the growth phase
of the RS4;11 cells prior to RNA extraction. In
logarithmic growth phase, the ratio of the two signals is
approximately 3:1, whereas in stationary phase, the
11.0 kb transcript is hardly discernible (Figs. 4A and
4B, panel a).

To define more precisely the nature of the
transcripts detected in control cell lines and in the
cell line with the t(4;11), three adjacent fragments of
clone 14P-18B (Fig. 2) were hybridized sequentially to
the same Northern blots (Fig. 4A, 4B). All of the probes
detected the 12.0 and 11.5 kb transcripts in normal
cells. The most centromeric 1.5EB probe also detected a
12.5 kb transcript on very long exposure of
autoradiograms. These three transcripts are normal MLL
transcripts which cross the 11q23 breakpoint region. The
fact that the 1.5EB probe is the only fragment of the
4.1 kb 14P-18B cDNA clone that detects the large 12.5 kb
transcript indicates the existence of alternative exon
splicing. To date, the only other cDNA clones which
detect this transcript are 14-9 and 14P-18C. These cDNA
clones contain Alu repeats, which might indicate the
presence of intron sequences in incompletely processed
MLL transcripts.

On sequential hybridization of these three fragments
to Northern blots of RNA from the RS4;11 cell line, there
was evidence of weak hybridization to the normal 12.5,
12.0 and 11.5 kb transcripts, all of which cross the
breakpoint (Fig. 4A, 4B). The present inventors now have
evidence that the over-expressed 11.5 kb transcript in
the RS4;11 cell line is not the same as the normal
11.5 kb transcript. The 1.5EB probe detects the normal
11.5 kb transcript in control cells; however, there is
only a weak hybridization signal to an 11.5 kb transcript
in the RS4;11 cell line (Fig. 4A, panel c). This weak
hybridization is proposed to be detection of the normal
11.5 kb transcript, which is a different transcript from
the over-expressed 11.5 kb transcript that is detected
with all the other more telomeric probes. These data
indicate that the weakly hybridizing 11.5 kb transcript
detected by the 1.5EB probe is one of the three normal
12.5, 12.0 and 11.5 kb MLL transcripts that cross the
breakpoint. The reduced expression of all three of these
transcripts in the RS4;11 cell line may be due to
transcription from only the normal chromosome 11.
Therefore, the over-expressed 11.5 kb transcript which
was detected with the more telomeric probes is an altered
MLL transcript derived from the der(4) chromosome (Fig.
4B panel a-c).

There was evidence of two other altered MLL
transcripts of 11.25 and 11.0 kb in the RS4;11 cell line.
The origin of these two transcripts was easier to define
as there was no hybridization to transcripts of these
sizes in RNA from normal cells. The 11.25 kb transcript
was detected with the centromeric 1.5EB probe and the
0.7B probe that contains sequences that span the
breakpoint, which suggests that it originates from the
der(11) chromosome (Fig. 4B panel c,d). The 11.0 kb
transcript was detected with the same three probes (14-7,
0.3BE and 0.7B) as the aberrant 11.5 kb transcript and is
probably derived from the der(4) chromosome (Fig. 4B
panel a-c) according to the scheme in Fig. 5. Thus the
inventors have developed cDNA probes for the MLL gene
which permit detection of three altered transcripts of
MLL arising from both derivative chromosomes in a cell
line with a t(4;11).
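
The reasoning used above to assign altered transcripts to
a derivative chromosome can be illustrated with a short
sketch. The probe names and the t(4;11) assignments follow
the text; the function name infer_origin, its arguments and
the returned labels are hypothetical and are introduced
here only for illustration. The sketch assumes that a
transcript also seen in normal cells is attributed to the
normal chromosome 11, and it ignores the 0.7B probe when
deciding sides because that probe spans the breakpoint.

```python
# Probes lying telomeric or centromeric to the breakpoint (per the text).
# 0.7B spans the breakpoint, so it does not discriminate between sides.
TELOMERIC_PROBES = {"14-7", "0.3BE"}
CENTROMERIC_PROBES = {"1.5EB"}


def infer_origin(detecting_probes, seen_in_normal_cells):
    """Return a hypothetical label for the likely source of a transcript.

    detecting_probes: iterable of probe names that hybridize to the band.
    seen_in_normal_cells: True if a band of this size and probe pattern
        is also found in RNA from normal cells.
    """
    probes = set(detecting_probes)
    if seen_in_normal_cells:
        return "normal chromosome 11"
    has_telomeric = bool(probes & TELOMERIC_PROBES)
    has_centromeric = bool(probes & CENTROMERIC_PROBES)
    if has_centromeric and not has_telomeric:
        return "der(11)"
    if has_telomeric and not has_centromeric:
        return "der(4)"  # partner derivative in a t(4;11)
    return "undetermined"


# Altered transcripts described for the RS4;11 cell line.
print(infer_origin({"0.7B", "1.5EB"}, False))          # 11.25 kb
print(infer_origin({"14-7", "0.3BE", "0.7B"}, False))  # 11.0 kb
```

The same probe pattern as the 11.0 kb transcript applies to
the over-expressed 11.5 kb transcript, which the text
likewise attributes to the der(4) chromosome.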

In recent reports by Croce and colleagues (Cimino et
al., 1991; 1992; Gu et al., 1992a), a genomic clone which
was 10 kb centromeric to the breakpoint region detected
a major transcript said to be about 12.5 kb and a minor
11.5 kb transcript with additional hybridization to an
11.0 kb species which was only found in cell lines with a
t(4;11). This 11.0 kb transcript may be the same as the
altered 11.25 kb MLL transcript detected in the RS4;11
cell line using the 0.7B and 1.5EB cDNA probes. The
inventors propose that this transcript is from the
der(11) chromosome. The discrepancy in size between the
transcript detected in this study and that of Cimino et
al. may be due to poor resolution of transcripts of this
large size. Using the centromeric genomic probe, Cimino
et al. (1992) also reported hybridization to 0.4 and
5.0 kb transcripts in a variety of cell lines which were
not found in the present study.

In summary, the cDNA and Northern analyses indicate
that the MLL gene is a large complex gene with numerous
transcript sizes. In analyses of the transcripts in the
RS4;11 cell line, the inventors found that there is
reduced expression of the normal MLL transcripts of 12.5,
12.0 and 11.5 kb, and that (Heim & Mitelman, 1987) the
over-expressed 11.5 kb transcript and the 11.0 kb
transcript, as well as the 11.25 kb transcript specific to
the RS4;11 cell line, are altered MLL transcripts arising
from the translocation derivative 4 and derivative 11
chromosomes, respectively. How, or if, these three
altered transcripts of the MLL gene alter normal MLL
protein expression and function and contribute to
leukemogenesis is still unknown.

A major question in reciprocal translocations is
which derivative chromosome contains the critical
junction. Analysis of complex translocations indicates
that, for these 11q23 translocations, it is the der(11)
chromosome. The Southern blot analysis of patient data,
as presented in Example II, supports this interpretation.
Because the direction of transcription of MLL is from
centromere to telomere, the juxtaposition of the 5'
sequences and the 5' flanking regulatory regions of MLL
remaining on the der(11) to various other genes on other
chromosomes may play an important role in all of these
leukemias. The fact that this translocation is
associated with lymphoid and myeloid leukemias suggests
that the regulated expression of the MLL gene may be
important in normal hematopoietic lineage specificity,
and that rearrangements of this gene play a critical role
in the oncogenic process of these leukemias.

EXAMPLE II

A cDNA Probe Detects All Rearrangements of the MLL Gene
in Leukemias with Common and Rare 11q23 Translocations

This example concerns the identification of a
restriction fragment from a cDNA clone which detects
rearrangements in all cases of the t(4;11), t(6;11),
t(9;11), and both types of t(11;19) examined as well as
in many rare translocations with a breakpoint at band
11q23. A key feature of this fragment is that it
contains exons that flank the breakpoints in all of these
cases. The present inventors have thus delineated an
8.3 kilobase breakpoint cluster region in the common and
rare translocations involving 11q23. In addition,
through the use of probes amplified by the polymerase
chain reaction (PCR) from the centromeric and telomeric
portions of this cDNA fragment, the present invention
provides methods and compositions for use in
distinguishing between the two derivative chromosomes.
Moreover, this example provides further data to support
the hypothesis that the derivative 11 chromosome contains
the critical translocation junction.

1. Materials and Methods

PATIENTS AND CELL LINES. Patient samples were
obtained from the University of Chicago Medical Center,
Saitama Cancer Center, Southwest Biomedical Research
Institute, and Memorial Sloan-Kettering Cancer Center.
The samples were selected on the basis of a karyotype
containing an 11q23 abnormality and the availability of
cryopreserved leukemic bone marrow or peripheral blood.
The cell line RS4;11 was a gift from J. Kersey at the
University of Minnesota (Stong & Kersey, 1985); SUP-T13
was a gift from S. Smith at the University of Chicago
(Smith et al., 1989); and Karpas 45 was a gift from
A. Karpas at Cambridge University (Karpas et al., 1977).

CYTOGENETIC ANALYSIS. Cytogenetic analysis was
performed using a trypsin-Giemsa banding technique.
Chromosomal abnormalities were described according to the
International System for Human Cytogenetic Nomenclature
(Harnden & Klinger, 1985).

cDNA LIBRARY. A cDNA library was prepared from a
monocytic cell line as described above in Example I. The
library was screened with probes from the centromeric and
telomeric ends of a 14 kilobase genomic BamHI fragment
(clone 14) and several cDNA clones were obtained and
mapped with restriction endonucleases. A 0.7 kilobase
fragment called MLL 0.7B was isolated from a cDNA clone
named 14P-18B and used as described below.

MOLECULAR ANALYSIS. DNA was extracted from
cryopreserved cells and digested with restriction
enzymes, electrophoresed on 0.7% agarose gels,
transferred to nylon membranes, and hybridized with
radiolabeled cDNA probes at 42°C. All DNA blots were
washed to a final stringency of 1X SSC and 1% SDS at 65°C
prior to autoradiography.

SEQUENCE ANALYSIS. Nucleotide sequences were
obtained by the dideoxy chain termination method with a
double stranded DNA sequencing strategy using the
Sequenase kit (United States Biochemical, Cleveland, OH).

POLYMERASE CHAIN REACTION (PCR). Amplification of
unique sequences from the 0.7 kilobase BamHI fragment,
corresponding to exons at the centromeric and telomeric
ends of the 9 kilobase germline fragment, was performed
using standard methods. 10 ng of cDNA were amplified in
50 µl of reaction mix containing 1.5 mM MgCl2, 1.25 mM
dNTPs, and 2.5 U of Taq polymerase. Reactions were
performed in an automated thermal cycler
(Perkin-Elmer/Cetus, Norwalk, CT) with denaturation at 92°C for
50 seconds, annealing at 50°C for 50 seconds, and
extension at 72°C for one minute.
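
For illustration only, the cycling conditions stated above
can be written down as a small data structure from which
the hold time per cycle follows directly. The step
temperatures and durations are those given in the text;
the number of cycles is not stated in this section, so the
n_cycles argument below is a placeholder supplied by the
user, and instrument ramp times are ignored. The names
Step, CYCLE, cycle_seconds and run_minutes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int   # hold temperature in degrees Celsius
    seconds: int  # hold time in seconds


# One amplification cycle as described in the text.
CYCLE = (
    Step("denaturation", 92, 50),
    Step("annealing", 50, 50),
    Step("extension", 72, 60),
)


def cycle_seconds(cycle=CYCLE):
    """Total hold time of one cycle, excluding ramp times."""
    return sum(step.seconds for step in cycle)


def run_minutes(n_cycles, cycle=CYCLE):
    """Hold time in minutes for n_cycles; n_cycles is an assumption
    of the caller, since the cycle count is not given in the text."""
    return n_cycles * cycle_seconds(cycle) / 60


print(cycle_seconds())  # 50 + 50 + 60 seconds of hold time per cycle
```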

2. Results

The inventors isolated a 0.7 kilobase BamHI cDNA
fragment which is composed of exons flanking the
centromeric and telomeric ends of an 8.3 kilobase genomic
BamHI fragment of the MLL gene (Example I, Figs. 1 and
2). On Southern blot analysis, this 0.7 kilobase cDNA
fragment, 0.7B, detected rearrangements of the MLL gene
in 61 patients (58 with leukemia and three with lymphoma)
and three cell lines (Fig. 6). This included all 48
cases (46 patients and two cell lines) with the common
translocations involving 11q23, including the
t(4;11)(q21;q23), t(6;11)(q27;q23), t(9;11)(p22;q23),
t(11;19)(q23;p13.1) and t(11;19)(q23;p13.3) (Table 3).

Also identified by the 0.7B probe were similar MLL
gene rearrangements in DNA from 8 patients and one cell
line with several less common 11q23 translocations listed
in Human Genome Mapping 11 (Table 3) (Mitelman et al.,
1991). These include translocations involving 1p32,
1q21, 2p21, 17q21, 17q25, Xq13, and three cases with
insertion 10;11. In addition, 7 other 11q23 anomalies
which have not been reported as recurring abnormalities,
including translocations involving 6p12, 10p11, 10q22,
15q15, 18q21, and 22q12, and one case with
inv(11)(q14q23), showed MLL rearrangements (Table 4).
The rearrangements detected in cell lines included RS4;11
with a t(4;11), SUP-T13 with a t(11;19), and Karpas 45
with a t(X;11)(q13;q23).

The 0.7B MLL probe did not detect rearrangements in
remission samples from patients who had rearrangements in
the DNA from their leukemia cells. In addition,
rearrangements were not identified in a few cases with
uncommon 11q23 translocations. These included AML
patients with a t(4;11)(q23;q23) and a t(5;11)(q13;q23),
and an ALL with a t(10;11)(p13;q23). However, and
importantly, no patients were identified with the common
11q23 translocations who failed to show rearrangements
with the 0.7 kilobase cDNA fragment termed 0.7B.

The age distribution of the leukemia patients in
this series was broad; 11 patients were one year or less,
16 were between the ages of two and 16, and 31 were 17
years or older. There were 27 females and 31 males. The
phenotype of the leukemias in these patients showed 28
with ALL and 30 with AML. The cases with ALL and AML
were indistinguishable by Southern blot analysis. In 70%
of cases, two rearranged bands, corresponding to the two
derivative chromosomes, were detected. Only a single
rearranged band was detected in the remaining 30% of
cases (Fig. 7). To determine whether there were any
potential correlations with the presence of one versus
two rearranged bands, the patients were analyzed by
karyotypic abnormalities, phenotype of the leukemic
cells, and by age. No significant associations between
the number of rearranged bands and any of these subgroups
were found.
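
As a simple arithmetic cross-check, the subgroup counts
given above for the leukemia patients can be summed and
compared with the 58 leukemia patients in whom
rearrangements were detected. The counts are taken from
the text; the dictionary and function names are
hypothetical and serve only to organize the check.

```python
# Leukemia patients with rearrangements detected by the 0.7B probe.
LEUKEMIA_PATIENTS = 58

# Subgroup counts as stated in the text.
SUBGROUPS = {
    "age": {"one year or less": 11, "two to 16 years": 16, "17 years or older": 31},
    "sex": {"female": 27, "male": 31},
    "phenotype": {"ALL": 28, "AML": 30},
}


def subgroup_totals(subgroups=SUBGROUPS):
    """Sum the counts within each way of partitioning the patients."""
    return {name: sum(counts.values()) for name, counts in subgroups.items()}


def all_consistent(expected=LEUKEMIA_PATIENTS, subgroups=SUBGROUPS):
    """True if every partition adds up to the expected patient total."""
    return all(total == expected for total in subgroup_totals(subgroups).values())


print(subgroup_totals())
print(all_consistent())
```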

In addition to these acute lymphoid and myeloid
leukemias, 20 cases of non-Hodgkin's lymphomas were also
examined. Rearrangements were detected in three of these
patients. This included one patient with a follicular
small cleaved-cell lymphoma who had a karyotype which
showed both a t(14;18)(q32;q21) and a t(6;11)(p12;q23), a
patient with Burkitt's lymphoma whose karyotype included
a t(8;14)(q24;q32) and an inv(11)(q14q23), and a patient
with a diffuse mixed small cleaved cell and large cell
lymphoma whose karyotype also included a trisomy 21. The
other 17 lymphomas with 11q23 abnormalities, primarily
deletions and duplications, did not show rearrangements.

To distinguish which derivative chromosome is 

2 0 represented by each of the rearranged bands on Southern 
blot analysis, sequences from the centromeric and 
telomeric portions of the 0.7 kilobase cDNA fragment, 
0.7B, were amplified by PCR to create distinct DNA 
probes. The centromeric PCR fragment detected the 

25 germline band and only one of the rearranged bands on 
Southern blot analysis. Thus, the rearranged band 
detected with this probe corresponds to the derivative 11 
[der(ll)] chromosome. The fragment amplified by PCR from 
the portion of the 0.7 kilobase cDNA fragment telomeric 

30 to the breakpoint was also hybridized to the same blots. 
The telomeric probe identified the germline band as well 
as the derivative chromosome of the other translocation 
partner. Clearly in cases with two rearranged bands, 
both derivative chromosor.es are present. However, in the 

35 cases in which only one rearranged band is detected, it 

consistently is identified only by the centromeric probe. 
Therefore, the sequences immediately centromeric to the 
breakpoint are always preserved but the sequences distal 
to the breakpoint appear to be deleted in 30% of cases.

In two of the patients (both Japanese) analyzed, a
different pattern of hybridization was noted with the
three probes employed. In one patient with a t(1;11) and
another with a t(4;11), the 0.7 kilobase cDNA probe and
the centromeric PCR probe both identified the same two
rearranged bands (Fig. 8). In all other cases, the
centromeric PCR probe recognized only one of the two
rearranged bands. In these two patients, as in all other
cases, the telomeric PCR probe detected only one of the
two rearranged bands. Presumably, these breaks differed
from the remainder of cases that were examined. Clearly,
a portion of the exon sequences in these two patients,
which in all other cases remains on the der(11), is
translocated to the other derivative chromosome. The
breaks may occur either within one or more exons on the
centromeric side of the 8.3 kilobase genomic fragment or
alternatively, if more than one exon is present, the
breaks may occur within an intron separating these exons.
Further analysis of the exon/intron boundaries within the
8.3 kilobase genomic BamHI fragment will allow the
determination of the precise localization of these
breakpoints.
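
The band-assignment reasoning used with the centromeric and telomeric PCR probes can be summarized as a small decision rule. The Python sketch below is illustrative only: the function names, the band labels, and the encoding of results as sets of rearranged-band labels are assumptions made for this example and are not part of the disclosed method.

```python
def interpret_bands(centromeric_hits, telomeric_hits):
    """Assign each rearranged (non-germline) Southern band to a derivative
    chromosome from the probes that detect it.

    Both arguments are sets of hypothetical band labels (e.g. "R1", "R2")
    for rearranged bands seen with the centromeric or telomeric PCR probe.
    """
    assignments = {}
    for band in sorted(centromeric_hits | telomeric_hits):
        in_cen = band in centromeric_hits
        in_tel = band in telomeric_hits
        if in_cen and in_tel:
            # Pattern seen in the two atypical patients: sequence normally
            # retained on the der(11) is also found on the partner derivative.
            assignments[band] = "other derivative (atypical break)"
        elif in_cen:
            assignments[band] = "der(11)"
        else:
            assignments[band] = "other derivative"
    return assignments


def telomeric_sequence_deleted(centromeric_hits, telomeric_hits):
    """True when a rearranged band is seen only with the centromeric probe,
    i.e. the sequences distal to the breakpoint appear to be deleted."""
    return bool(centromeric_hits) and not telomeric_hits


# Typical case with two rearranged bands
print(interpret_bands({"R1"}, {"R2"}))
# Case with a single rearranged band
print(interpret_bands({"R1"}, set()), telomeric_sequence_deleted({"R1"}, set()))
```

The rule treats a band seen by both sub-probes as the atypical pattern described for the t(1;11) and t(4;11) patients; any other encoding of the hybridization results would work equally well.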

3. Discussion

The present inventors have identified DNA
rearrangements in 61 patients and three cell lines with
11q23 abnormalities that affect the MLL gene and have
delineated an 8.3 kilobase breakpoint cluster region
within this gene using a 0.7 kilobase BamHI cDNA fragment
(seq id no:1) as a probe. Rearrangements have been
detected in all 48 cases examined with the t(4;11),
t(6;11), t(9;11), and both types of t(11;19) as well as
in 12 rare translocations, three insertions, and one
inversion involving 11q23. Rearrangements were also
detected in three patients with non-Hodgkin's lymphoma.
These are the first cases of lymphoma that have been
found to share the same breakpoint as the leukemias with
11q23 translocations. While rearrangements are
detectable with multiple restriction enzymes, digestion
with only a single enzyme, BamHI, was sufficient to
identify each case with a rearrangement. In 70% of these
cases, two rearranged bands, corresponding to the two
derivative chromosomes, were identified and in 30%, only
one band was present, which we showed was derived from the
der(11) chromosome.

The present study using the novel probes described
above, particularly the 0.7 kb BamHI fragment, gave
significantly improved results over all previously
reported studies. For example, Cimino et al. described
the identification of a 0.7 kb DdeI genomic fragment that
detected rearrangements in a 5.8 kilobase region in 6 of
7 patients with the t(4;11), 4 of 5 with t(9;11), and 3
of 4 with the t(11;19) (Cimino et al., 1991). In three
of these 16 patients, two rearranged bands were detected
and in the remainder, only one rearranged band was
identified. Subsequently, they reported on an additional
14 patients with this probe (Cimino et al., 1992). In
their combined series, this probe detected rearrangements
in 26 of 30 cases (87%) with the t(4;11), t(9;11), and
t(11;19). They hypothesize that the breaks in the 4
cases that were not identified with their probe occur
either at another site within this gene or at other loci
in 11q23. Assuming that the true incidence of
rearrangements within the breakpoint cluster region in
patients with the 5 common 11q23 translocations is 87%,
then the likelihood, calculated by binomial
probabilities, of identifying rearrangements in 48 of 48
consecutive cases is 0.0014. Thus, the failure to detect
rearrangements in those 4 cases by Cimino and colleagues
is likely due to the separation of these breaks from the
genomic DdeI probe by a DdeI restriction site.
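
The binomial argument above can be reproduced with a few lines of Python. This is a minimal sketch, assuming independent cases and a per-case detection probability of exactly 0.87 (the rounded figure from the text); the function and variable names are chosen for illustration. Under those assumptions the result is on the order of one in a thousand, the same order of magnitude as the value quoted above; small differences may reflect how the rate was rounded.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


detection_rate = 0.87  # assumed true incidence, taken from the text
n_cases = 48           # consecutive cases examined

# Probability that all 48 cases show a rearrangement if the true rate is 87%
p_all_detected = binomial_pmf(n_cases, n_cases, detection_rate)
print(f"P(48 of 48 | p = {detection_rate}) = {p_all_detected:.5f}")
```

Because every case must be detected, the expression reduces to the detection rate raised to the 48th power; the general form is kept so that other outcomes (for example 47 of 48) can be evaluated the same way.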

Importantly, whereas the breakpoint in many cases
with 11q23 translocations may be contained within a 5.8
kilobase genomic fragment, the breakpoint cluster region
of the present invention encompasses a larger region of
8.3 kilobases and contains the breakpoints in all
leukemia cases with the common translocations, as well as
in all except three of the rare translocations examined.

Pulsed field gel electrophoresis (PFGE) and
fluorescence in situ hybridization (FISH) both have been
used to map the region containing the 11q23 breakpoints
in leukemias (Savage et al., 1988; 1991; Yunis et al.,
1989; Tunnacliffe & McGuire, 1990). With FISH, the
breakpoint lies telomeric to the CD3G gene and
centromeric to the PBGD gene (Rowley et al., 1990). With
PFGE, the distance between the CD3G gene and the
breakpoint in the t(4;11) has been narrowed to 100-200
kilobases (Das et al., 1991). Chen et al. (1991) have
shown by PFGE that there is a clustering of breakpoints
in eight cases with the t(4;11) and in two other patient
samples with 11q23 translocations but the size and
location of this region could not be determined
precisely.

Whereas the data presented herein and that of Cimino
et al. (1991; 1992) indicate a clustering of breakpoints,
several studies have suggested that the breakpoints on
11q23 may be heterogeneous. Using cosmid probes and
FISH, Cherif et al. (1992) found that one of their probes
was proximal to the breakpoint in the t(11;19) and distal
to those in the t(4;11), t(6;11), and t(9;11). Cotter et
al. (1991), using PCR amplification of microdissected
material from 11q23, reported that the breaks in two
t(6;11) cases were proximal to the CD3D gene and that the
breakpoints in the t(4;11) and t(9;11) were distal to
this gene.

Molecular studies have confirmed that the
breakpoints in translocations involving the antigen
receptor loci on chromosome 14 differ from the 11q23
translocations just discussed. Studies on the RCK8 B-
cell lymphoma line which has a t(11;14)(q23;q32) showed
that the immunoglobulin heavy chain constant region gene
and a gene called RCK were involved in the translocation
(Akao et al., 1990; 1991a). Mapping data indicate that
RCK is over 100 kilobases telomeric to MLL (Radice &
Tunnacliffe, 1992). In addition, the present inventors
cloned a t(11;14)(q23;q11) from a patient with a null-
cell ALL and identified rearrangements of the T cell
receptor alpha/delta locus. DNA probes from this 11q23
breakpoint failed to show rearrangements in leukemias
with the common 11q23 translocations. Mapping data
indicate that this breakpoint is approximately 700
kilobases telomeric to MLL. Therefore, band 11q23
contains breakpoints for at least three different cancer-
related translocations. However, the data presented
herein establish a tight clustering of breakpoints in the
MLL gene which is centromeric to RCK and the other
t(11;14) breakpoints previously described by the
inventors.

In reciprocal translocations, the identification of
the derivative chromosome containing the critical
junction is essential. Based on data from Southern blot
analysis, FISH, and cytogenetic analysis of complex
translocations, the inventors propose that the der(11)
contains the critical junction. At the molecular level,
the Southern blot analyses show a consistent pattern that
indicates that the 5' portion of the exon sequences
centromeric to the breakpoint on the der(11) is always
conserved. In those cases in which the 0.7 kilobase cDNA
fragment identifies one rearranged band, it is always
detected by only the centromeric PCR probe. Thus, exon
sequences from the centromeric portion of the 8.3
kilobase BamHI genomic fragment are always preserved on
the der(11) but the exon sequences from the telomeric
portion of this genomic fragment can be deleted in the
formation of the translocation.



Previously, the inventors identified a patient with
a t(9;11) who was found to have a deletion by FISH of a
series of probes spanning several hundred kilobases
telomeric to the breakpoint on 11q23 (Rowley et al.,
1990). On Southern blot analysis of this patient's DNA,
only one rearranged band was identified and thus the exon
telomeric to the breakpoint was deleted. Recently, using
FISH, the present inventors also found that a phage clone
containing a large portion of the 14 kilobase genomic
BamHI fragment immediately telomeric to the 8.3 kilobase
breakpoint cluster region was also deleted in this
patient. This 14 kilobase genomic BamHI fragment
contains an open reading frame of MLL. Presumably, all
of the coding sequences distal to the breakpoint are
deleted in this patient. In addition, another patient
with a t(6;11) was also found to have one rearranged band
on Southern analysis and a deletion of this same phage
clone by FISH. Thus, in several patients, deletions begin
within the breakpoint cluster region and extend distally
to include the region containing coding sequences of the
gene.

The molecular and FISH data indicating that the
der(11) chromosome contains the critical junction are
supported by an analysis of complex translocations that
involve three chromosomes. For example, in a
t(4;11;17)(q21;q23;q11), the movement of the 4q to 11q
{the der(11)} is conserved whereas the 11q is
translocated to the derivative 17 chromosome. An
analogous pattern has been identified in 13 cases of
complex translocations. Based on the data of the present
invention, the following model is proposed. As a result
of the translocation, sequences on the der(11) are joined
to a large number of other chromosomal breakpoint
regions, 19 detected in the inventors' laboratories
alone. Presumably, the 5' sequences of the MLL gene are
thus juxtaposed to 3' sequences from genes located on the
other translocation partners. The present invention
provides the molecular tools to allow the functional
consequences of these translocations to be determined.

The present inventors have delineated a breakpoint
cluster region in the MLL gene and have identified
rearrangements in a total of 19 different translocations,
insertions, and inversions involving 11q23. The 0.7
kilobase cDNA probe of the present invention, and its
derivative centromeric and telomeric PCR probes, are
proposed to be broadly applicable to clinical diagnosis,
particularly as they detect all of the rearrangements in
DNA digested with a single enzyme (BamHI). This is
envisioned to be useful in the rapid detection of
leukemia in both children and adults and will be
especially important in leukemic infants under one year
of age in whom the single most common chromosomal
abnormality is a translocation involving 11q23. In
addition, it is contemplated that this probe will be
effective for monitoring response to chemotherapy and for
evaluation of minimal residual disease following
treatment. These probes will be essential in cloning the
breakpoints of leukemias which involve the MLL locus and
in further molecular analysis of these translocations.



EXAMPLE III

Sequencing of the 8.3 kilobase Genomic BamHI Fragment
that Contains All of the Common MLL Translocation Breakpoints

The inventors have recently obtained the DNA
sequence for the 8.3 kb genomic BamHI fragment which
contains all of the common translocation breakpoints.
This sequence is provided in the present application as
seq id no:6.

The inventors envision using this new sequence
information to map the intron-exon boundaries within this
region and to identify the specific nucleotides involved
in the breakpoint junctions in various patients.

EXAMPLE IV

Expression of MLL-Derived Proteins and Anti-MLL
Antibodies



1. Production of Antisera to a Region of MLL Telomeric
to the Breakpoint Region (MLL Amino Acids of Seq Id
No:8)

To express MLL amino acids of seq id no:8
(corresponding to MLL amino acids 2772-3209 of Tkachuk et
al., 1992), plasmid 14-7 was digested with EcoRI and the
insert was ligated into plasmid pGEX-KG digested with
EcoRI, resulting in the 1.3 kb MLL fragment inserted in
frame into the expression vector. This construct
produces an MLL amino acid-containing fusion protein with
GST (glutathione-S-transferase). This DNA was
transformed into JM101 bacteria. To produce large
quantities of the MLL protein corresponding to seq id
no:8 for production of rabbit antisera, the plasmid-
transformed bacteria were grown in LB medium and induced
to express the fusion protein with IPTG.
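
As a quick arithmetic check on the insert size, the coding length implied by an amino acid range can be computed directly. The Python sketch below assumes the insert consists only of codons for the stated residues (no linker or untranslated sequence); the function name is hypothetical.

```python
def coding_length_bp(first_aa, last_aa):
    """Base pairs needed to encode residues first_aa..last_aa inclusive
    (three nucleotides per codon)."""
    return (last_aa - first_aa + 1) * 3


# Residues 2772-3209 (numbering of Tkachuk et al., 1992)
length_bp = coding_length_bp(2772, 3209)
print(f"{length_bp} bp = {length_bp / 1000:.1f} kb")
```

Under that assumption the range corresponds to 438 codons, which is consistent with the approximately 1.3 kb fragment described above.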

This fusion protein was purified using glutathione-
agarose affinity chromatography, followed by preparative
SDS-polyacrylamide gel electrophoresis. The fusion
protein was then electroeluted from the gel and used to
immunize rabbits in order to generate specific antisera
(performed by Josman Laboratories, Napa, CA). The rabbit
antisera produced against the MLL protein corresponding
to seq id no:8 have a very high titer by western blotting
and react specifically with the MLL portion of the
fusion protein (Fig. 10).



2. Production of Antisera to a Region of MLL
Centromeric to the Breakpoint Region (MLL Amino
Acids 323-623 from Seq Id No:7)

Specific MLL oligonucleotides with SmaI restriction
enzyme sites were used as PCR primers to amplify the
sequence encoding MLL amino acids 323-623 from seq id
no:7 using the plasmid 14P-18B as template. This
amplified DNA was digested with SmaI and ligated into
plasmid pGEX-KT (an improved version of plasmid pGEX-KG
used above) that had been digested with SmaI. This
results in MLL amino acids 323-623 (representing MLL
amino acids 1101-1400 of Tkachuk et al., 1992),
corresponding to the proline-rich region, being inserted
in-frame into the expression vector. This DNA was
transformed into BL21 bacteria. Large amounts of this
fusion protein can be produced using this methodology
and employed in the production of specific antisera, for
example, using rabbits.



Such antibodies may be employed as part of the 
ongoing studies directed to the MLL protein. For 
example, they may be employed to determine the MLL protein
localization within the cell, or to determine whether 
this protein binds to DNA. The generation of monoclonal 
antibodies has also been made possible by the present 
invention. 

EXAMPLE V 
Expression of Various MLL Domains 

The MLL zinc finger regions (corresponding to amino
acids 1350-1700, 1700-2000, and 1350-2000 of Tkachuk et
al., 1992) have been cloned into the pGEX-KT expression
vector as described above. In addition, the inventors
propose to clone various of the MLL protein coding
regions into the expression vector pSg24 in pieces
ranging from 300-650 amino acids to allow the functional
definition of the MLL protein.

EXAMPLE VI

Detection of MLL Gene Rearrangements in Karpas 45
Leukemic Cells with a t(X;11)(q13;q23) Translocation

This example concerns the detection and
characterization of aberrant MLL transcripts in Karpas 45
leukemic cells with a t(X;11)(q13;q23) translocation and
provides further evidence of the utility of the present
probes in detecting leukemic cells with different
breakpoints.

In this analysis of the Karpas 45 cell line (Karpas
et al., 1977), known to have a t(X;11)(q13;q23)
translocation (Kearney et al., 1992), the inventors show
the MLL gene to be rearranged and demonstrate the
presence of two altered MLL transcripts which come from
the der(11) chromosome. MLL was also found to be
rearranged using Southern blot analyses of DNA from
Karpas 45.

1. Materials and Methods 

The T-cell line Karpas 45, established from a
patient with a T-cell ALL, was obtained from A. Karpas
(University of Cambridge, England; Karpas et al., 1977).
Karpas 45 has been shown, by fluorescence in situ
hybridization, to have a t(X;11)(q13;q23), which involves
rearrangement of the MLL gene. The cell lines RC-K8 and
RCH-ADD, which do not have chromosomal translocations
that involve MLL, have been described previously (Ziemin-
van Der Poel et al., 1991) and were used as controls.

The cDNA probe 14P-18B has been described herein in
the previous examples. The cDNA clone was digested with
EcoRI and BamHI to give three fragments for use in
Northern and Southern blot hybridizations. The 0.7B
probe, which spans the breakpoint, and the 1.5EB probe,
centromeric to the breakpoint, have been described
hereinabove. A further 0.8 kb EcoRI fragment, which is
telomeric to the breakpoint, was obtained and used in this
study; this probe is termed 0.8E. It should be noted
that the EcoRI site used to excise the 1.5EB fragment was
a cloning site.

DNA was extracted from the Karpas 45 cell line and
normal human placenta, digested with the restriction
enzyme BamHI and electrophoresed on a 1% agarose gel.
Poly A+ RNA was isolated from the cell lines Karpas 45,
RC-K8 and RCH-ADD using the Fast Track Isolation Kit
(Invitrogen) and 5 µg were electrophoresed on a 0.8%
formaldehyde gel as described hereinabove. Radioactive
labeling of cDNA fragments, hybridization and washing
conditions were as described in the previous examples.

2. Results and Discussion

To determine if MLL was rearranged in the Karpas 45
cell line, known to have an 11q23 translocation, a Southern
blot with BamHI digested DNA was hybridized to the 0.7B
probe. Figure 11 shows that the MLL gene was rearranged
in this 11q23 translocation and that two rearranged
fragments are evident, indicating the detection of
sequences from both derivative chromosomes X and 11.

To determine the nature of the MLL transcripts in
this cell line, a Northern blot was hybridized
sequentially to three different fragments of the 14P-18B
cDNA clone. The fragments used were 0.8E (telomeric to
the breakpoint), a 0.7B fragment (which spans the
breakpoint) and finally a 1.5EB fragment (which is
centromeric to the breakpoint), as shown in Fig. 2. All
three fragments were found to show weak hybridization to
the two normal sized MLL transcripts in all the cell
lines (Fig. 12).

The 0.7B and the 1.5EB fragments detected two
additional transcripts, an abundant 8.0 kb transcript and
a diffuse band around 6.0 kb in the Karpas 45 cell line,
which were not present in the control cell lines (Fig.
12). Furthermore, these two transcripts were not
detected by the more telomeric 0.8E fragment (Fig. 12).
Hybridization to actin indicated that there was
approximately 50% less RNA in the Karpas 45 cell line
lane compared to RNA in the control cell line (Fig. 12).

It should be noted here that the two normal sized
MLL transcripts, listed as being of about 15 and 13
kilobases, are the same transcripts previously referred
to as about 12 and about 11.5 kb throughout the earlier
examples. This illustrates the fact that the studies
shown in Fig. 12 were conducted at a later date and that,
as mentioned before, the earlier Northern blot size
determinations were generally approximations, as is well
known to result from using this method to determine sizes
of greater than about 9 or 10 kb. However, this study of
the Karpas cell line further exemplifies the utility of
the probes in differentiating between normal and leukemic
cells.

The present study further supports the inventors'
findings that the breakpoint cluster region in the MLL
gene occurs within a 9.0 kilobase BamHI genomic fragment.
On Northern analysis all three of the cDNA fragments
detected the normal-sized MLL transcripts in the control
cell lines, and to a lesser extent in the Karpas 45 cell
line. However, the 0.7B and the 1.5EB fragments, which
span and are centromeric to the breakpoint junction
respectively, detected two additional altered transcripts
of the MLL gene in the Karpas 45 cell line. As the more
telomeric 0.8E fragment did not hybridize to these two
novel transcripts, it may be concluded that these
transcripts are altered MLL transcripts coming from the
derivative 11 chromosome.
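
The inference drawn from the three probes can be expressed as simple set operations. The Python sketch below is illustrative only: the dictionary layout, the function names, and the use of nominal transcript sizes in kilobases as identifiers are assumptions made for this example; the sizes themselves are those reported in the text.

```python
def aberrant_transcripts(sample_hits, control_hits):
    """For each probe, return transcript sizes (kb) seen in the sample
    but not in the control cell lines."""
    return {
        probe: sizes - control_hits.get(probe, set())
        for probe, sizes in sample_hits.items()
    }


def der11_candidates(aberrant, centromeric="1.5EB", spanning="0.7B",
                     telomeric="0.8E"):
    """Aberrant transcripts detected by the centromeric and spanning probes
    but not by the telomeric probe, i.e. transcripts retaining only
    sequences centromeric to the breakpoint."""
    retained = aberrant.get(centromeric, set()) & aberrant.get(spanning, set())
    return retained - aberrant.get(telomeric, set())


normal = {15.0, 13.0}  # normal-sized MLL transcripts (kb)
control = {"0.8E": normal, "0.7B": normal, "1.5EB": normal}
karpas45 = {
    "0.8E": normal,
    "0.7B": normal | {8.0, 6.0},
    "1.5EB": normal | {8.0, 6.0},
}

candidates = der11_candidates(aberrant_transcripts(karpas45, control))
print(sorted(candidates))
```

With the hybridization pattern described for Karpas 45, the 8.0 kb and approximately 6.0 kb transcripts are the only ones that meet the rule, matching the conclusion that they derive from the derivative 11 chromosome.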

Evidence of any altered MLL transcripts derived from
the reciprocal chromosome X was not found in the Karpas
45 cell line. This is in keeping with the inventors'
proposition that the derivative 11 chromosome contains
the critical junction in two and three way reciprocal
translocations involving chromosome band 11q23 and the
associated rearrangement of the MLL gene.



While the compositions and methods of this invention
have been described in terms of preferred embodiments, it
will be apparent to those of skill in the art that
variations may be applied to the composition, methods and
in the steps or in the sequence of steps of the method
described herein without departing from the concept,
spirit and scope of the invention. More specifically, it
will be apparent that certain agents which are both
chemically and physiologically related may be substituted
for the agents described herein while the same or similar
results would be achieved. All such similar substitutes
and modifications apparent to those skilled in the art 
are deemed to be within the spirit, scope and concept of 
the invention as defined by the appended claims. All 
claimed matter and methods can be made and executed 

without undue experimentation.

CLAIMS

1. A method for detecting leukemic cells containing
11q23 chromosome translocations, comprising:

(a) obtaining genomic DNA from cells suspected
of containing a leukemia-associated
chromosomal rearrangement at chromosome
11q23;

(b) digesting said DNA with one or more
restriction enzymes; and

(c) probing said digested DNA with a nucleic
acid probe which includes a sequence in
accordance with the sequence of a 0.7 kb
BamHI fragment of cDNA clone 14P-18B.




2. The method of claim 1, wherein said DNA is digested 
with the single restriction enzyme BamHI.



3. The method of claim 1, wherein the nucleic acid
probe is the nucleic acid probe termed MLL 0.7B (seq id
no:1).



4. The method of claim 1, wherein the cells are
obtained from a patient suspected of having a leukemia
associated with a chromosomal rearrangement at chromosome
11q23.



5. A method for identifying an individual having a
leukemia associated with an 11q23 chromosome
translocation, comprising digesting a genomic DNA sample
obtained from said individual with the restriction enzyme
BamHI and probing the digested DNA with a 0.7 kb BamHI
restriction fragment obtained from MLL DNA, wherein said
0.7 kb fragment encompasses the breakpoints clustered in
an 8.3 kb BamHI genomic region of the MLL gene.



6. The method of claim 5, wherein the 0.7 kb fragment
is the fragment termed MLL 0.7B (seq id no:1).



7. The method of claim 5, wherein the chromosome 11
translocation in the 8.3 kb region of the MLL gene is a
reciprocal translocation with chromosome 4, chromosome 6,
chromosome 9, chromosome 19 or the X chromosome.



8. A method for detecting leukemic cells containing
11q23 chromosome translocations, comprising:

(a) obtaining mRNA from cells suspected of
containing a leukemia-associated
chromosomal rearrangement at chromosome
11q23; and

(b) probing said mRNA with a nucleic acid
probe capable of identifying normal MLL
gene transcripts and aberrant MLL gene
transcripts, wherein a reduction in the
amount of a normal MLL gene transcript or
the presence of an aberrant MLL gene
transcript is indicative of a cell
containing an 11q23 chromosome
translocation.

9. The method of claim 8, wherein a reduction in the
amount of a normal MLL gene transcript is characterized
as a reduction in the amount of an MLL gene transcript of
about 12.5 kb, about 12.0 kb or about 11.5 kb in length.

10. The method of claim 8, wherein the nucleic acid
probe is fragment MLL 0.7B (seq id no:1), fragment MLL
0.3BE (seq id no:2), fragment MLL 1.5EB (seq id no:3) or
the cDNA clone 14-7 (seq id no:5).

11. The method of claim 8, wherein the nucleic acid
probe is fluorescently labelled.

12. The method of claim 8, wherein the cells are
obtained from a patient suspected of having a leukemia
associated with a chromosomal rearrangement at chromosome
11q23.

13. A DNA segment, free from total genomic DNA, having a
sequence in accordance with, or complementary to, the
sequence of fragment MLL 0.7B (seq id no:1), fragment MLL
0.3BE (seq id no:2), fragment MLL 1.5EB (seq id no:3),
cDNA clone 14P-18B (seq id no:4) or cDNA clone 14-7 (seq
id no:5), derived from the MLL gene.

14. The DNA segment of claim 13, further defined as the
fragment MLL 0.7B (seq id no:1).



15. The DNA segment of claim 13, further defined as the 
fragment MLL 0.3BE (seq id no:2). 



16. The DNA segment of claim 13, further defined as the
fragment MLL 1.5EB (seq id no:3).



17. The DNA segment of claim 13, further defined as the 
cDNA clone 14-7 (seq id no: 5). 

18. A kit for use in the detection of leukemic cells
containing 11q23 chromosome translocations, comprising a
first container which includes a nucleic acid probe which
includes a sequence in accordance with the sequences of
nucleic acid probes MLL 0.7B (seq id no:1), MLL 0.3BE
(seq id no:2), MLL 1.5EB (seq id no:3) or 14-7 (seq id
no:5); and a second container which comprises a nucleic
acid probe for use as a control.

19. The kit of claim 18, wherein the first container
includes the nucleic acid probe MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) or 14-7
(seq id no:5).

20. The kit of claim 19, wherein the first container
includes the nucleic acid probes MLL 0.7B (seq id no:1),
MLL 0.3BE (seq id no:2), MLL 1.5EB (seq id no:3) and 14-7
(seq id no:5).

21. The kit of claim 18, further comprising a third
container which includes a restriction enzyme.

22. The kit of claim 21, wherein the first container
includes the nucleic acid probe MLL 0.7B (seq id no:1)
and the third container includes the restriction enzyme
BamHI.

23. The kit of claim 18, wherein the nucleic acid probe
is fluorescently labelled.

24. A protein including an MLL amino acid sequence
purified relative to its natural state.

25. The protein of claim 24, wherein the protein
includes an MLL amino acid sequence telomeric to the
breakpoint region.

26. The protein of claim 25, wherein the protein
includes an MLL amino acid sequence in accordance with
seq id no:8.



27. The protein of claim 24, wherein the protein
includes an MLL amino acid sequence centromeric to the
breakpoint region.

28. The protein of claim 27, wherein the protein
includes an MLL amino acid sequence in accordance with
amino acids 323-623 of seq id no:7.

29. The protein of claim 27, wherein the protein
includes a zinc finger region.

30. An antibody having binding affinity for a protein 
including an MLL amino acid sequence. 

31. The antibody of claim 30, wherein the protein
includes an MLL amino acid sequence centromeric to the
breakpoint region, an MLL amino acid sequence telomeric
to the breakpoint region or an MLL zinc finger region.